Preface



The Symposium on Theoretical Aspects of Computer Science (STACS) is held
annually, alternating between France and Germany. The STACS meetings are
organized jointly by the Special Interest Group for Theoretical Computer
Science of the Gesellschaft für Informatik (GI) in Germany and the Maison de
l'Informatique et des Mathématiques Discrètes (MIMD) in France.

STACS 2001 was the 18th in this series, held in Dresden, February 15-17,
2001. Previous STACS symposia took place in Paris (1984), Saarbrücken (1985),
Orsay (1986), Passau (1987), Bordeaux (1988), Paderborn (1989), Rouen (1990),
Hamburg (1991), Cachan (1992), Würzburg (1993), Caen (1994), München (1995),
Grenoble (1996), Lübeck (1997), Paris (1998), Trier (1999), and Lille (2000). It
may be worth noting that in 2001 the symposium was held in one of the new 
states of reunited Germany for the first time. The proceedings of all of these 
symposia have been published in the Lecture Notes in Computer Science series 
of Springer-Verlag. 

STACS has become one of the most important annual meetings in Europe
for the theoretical computer science community. It covers a wide range of topics 
in the area of foundations of computer science: algorithms and data structures, 
automata and formal languages, computational and structural complexity, logic, 
verification, and current challenges. This year, 153 submissions were received, 
mostly in electronic form, from more than 30 countries, with a fair portion from 
non-European countries. We would like to thank Jochen Bern, who designed
the electronic submission procedure, which performed marvelously and was of
great help to the program committee. The program committee met for two
days in Dresden and selected 46 out of the 153 submissions. Most of the papers 
were evaluated by four members of the program committee, partly with the 
assistance of subreferees. We thank the program committee for the thorough and 
careful work. Our gratitude extends to the numerous subreferees. The program 
committee was impressed by the high scientific quality of the submissions as 
well as the broad spectrum they covered. Because of the constraints imposed 
by the limited period of the symposium, a number of good papers could not be 
accepted. 

We thank the three invited speakers at this symposium, Julien Cassaigne
(Marseille), Martin Grohe (Chicago), and Dexter Kozen (Ithaca), for accepting
our invitation to share their insights on new developments in their research areas. 

We would like to express our sincere gratitude to all the members of the 
Institut für Theoretische Informatik and to the local organizing committee who
invested their time and energy to organize this conference. 

We would like to acknowledge the various sources of financial support for
STACS 2001, especially the Deutsche Forschungsgemeinschaft (DFG), the
Ministerium für Wissenschaft und Kunst des Landes Sachsen, and Freunde und
Förderer der TU Dresden.
Recurrence in Infinite Words

(Extended Abstract) 



Julien Cassaigne 

Institut de Mathématiques de Luminy
Case 907, F-13288 Marseille Cedex 9, France
cassaigne@iml.univ-mrs.fr



Abstract. We survey some results and problems related to the notion 
of recurrence for infinite words. 



1 Introduction 

The notion of recurrence comes from the theory of dynamical systems. A system
$T : X \to X$ is recurrent when any trajectory eventually returns arbitrarily near
its starting point, or in more formal terms, when for any open subset $U$ of $X$
and any $x \in U$, there exists an integer $n \geq 1$ such that $T^n(x) \in U$. If this
$n$ (the return time) can be chosen independently of $x$ for a given $U$, the
system is said to be uniformly recurrent.

When $X = \overline{\mathcal{O}(u)}$ is the subshift generated by an infinite word $u$, the
recurrence of $X$ can be expressed as a combinatorial property of the word $u$. Moreover,
it is possible to compute return times, which makes it possible to quantify the speed of
recurrence in the system, via the recurrence function of the word $u$. This point
of view was initiated by Morse and Hedlund in their 1938 article on symbolic
dynamics [13].

In this article, we survey some results and problems concerning recurrence 
in infinite words. 

2 Preliminaries 

Let $A$ be a finite alphabet, with at least two elements. We denote by $A^*$ the
set of finite words over $A$ (i.e., the free monoid generated by $A$), including the
empty word $\varepsilon$, and by $A^{\mathbb{N}}$ the set of one-way infinite words over $A$. Given an
infinite word $u \in A^{\mathbb{N}}$, we denote by $F(u)$ the set of factors (or subwords) of $u$,
and, for any $n \in \mathbb{N}$, by $F_n(u)$ the set of factors of length $n$ of $u$.

The shift is the operator $T$ on $A^{\mathbb{N}}$ defined by $T(u_0 u_1 u_2 u_3 \ldots) = u_1 u_2 u_3 \ldots$.
The set $A^{\mathbb{N}}$ equipped with the product topology is a compact topological space,
and under the action of $T$ it becomes a discrete dynamical system named the
full shift. A closed subset of $A^{\mathbb{N}}$ invariant under $T$ is a subshift, and in particular
any infinite word $u \in A^{\mathbb{N}}$ generates a subshift $\overline{\mathcal{O}(u)}$, the adherence of $\mathcal{O}(u) =
\{T^n(u) : n \in \mathbb{N}\}$.
3 Recurrence

3.1 Recurrent and Uniformly Recurrent Words 

An infinite word u is said to be recurrent if any factor of u occurs infinitely 
often in u. Recurrence can be characterized using an apparently much weaker 
property: 

Proposition 1. An infinite word $u \in A^{\mathbb{N}}$ is recurrent if and only if any prefix
of $u$ occurs at least twice in $u$.

An infinite word $u$ is said to be uniformly recurrent if it is recurrent and,
additionally, for any factor $w$ of $u$, the distance between two consecutive occurrences
of $w$ in $u$ is bounded by a constant that depends only on $w$.

For instance, any purely periodic infinite word is uniformly recurrent. Many 
classical infinite words like the Thue-Morse and Fibonacci words are uniformly 
recurrent. An eventually periodic word which is not purely periodic is not recur- 
rent. The word 

0101101110111101111101111110111111101111111101111111110111111111 . . . 

(where the number of ones between consecutive zeros increases each time by one) 
is not recurrent and cannot be made recurrent by removing a prefix. 

There are infinite words which are recurrent but not uniformly recurrent.
Examples are easily constructed as fixed points of substitutions on $A^*$. For instance,
the word

0101110101111111110101110101111111111111111111111111110101110101 . . . 

is a fixed point of the substitution $0 \mapsto 010$, $1 \mapsto 111$. It is recurrent but not
uniformly recurrent: since it is a fixed point of a substitution, it suffices
to show that both symbols 0 and 1 occur infinitely often, but with
unbounded gaps between consecutive occurrences in the case of 0. Note that this
word can also be defined as the characteristic word of the set of nonnegative
integers that have at least one 1 in their ternary expansion.
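The ternary description can be checked experimentally on a finite prefix. The following is a minimal sketch in Python; the helper names `iterate_substitution` and `has_digit_one_in_ternary` are ours and not part of the original text. It generates a prefix of the fixed point, compares it with the characteristic word, and measures the largest gap between consecutive zeros in that prefix, which keeps growing as the prefix gets longer.

```python
def iterate_substitution(rules, seed, min_length):
    """Apply a substitution to `seed` until the word has at least `min_length` letters."""
    word = seed
    while len(word) < min_length:
        word = "".join(rules[letter] for letter in word)
    return word[:min_length]


def has_digit_one_in_ternary(n):
    """Return True when the base-3 expansion of n contains the digit 1."""
    while n > 0:
        if n % 3 == 1:
            return True
        n //= 3
    return False


# Prefix of length 3**5 of the fixed point of 0 -> 010, 1 -> 111.
prefix = iterate_substitution({"0": "010", "1": "111"}, "0", 243)

# Characteristic word of the integers having a digit 1 in base 3.
characteristic = "".join(
    "1" if has_digit_one_in_ternary(n) else "0" for n in range(243)
)
print(prefix == characteristic)

# Largest distance between two consecutive occurrences of 0 in this prefix.
zeros = [i for i, letter in enumerate(prefix) if letter == "0"]
largest_gap = max(b - a for a, b in zip(zeros, zeros[1:]))
print(largest_gap)
```

Increasing the prefix length by a factor of 3 roughly triples the largest gap, which illustrates why the word cannot be uniformly recurrent.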

3.2 The Recurrence Function 

The recurrence function of an infinite word $u$ is the function $R_u : \mathbb{N} \to \mathbb{N} \cup \{+\infty\}$
defined by

$$R_u(n) = \inf\left(\{N \in \mathbb{N} : \forall w \in F_N(u),\ F_n(w) = F_n(u)\} \cup \{+\infty\}\right).$$

In other words, $R_u(n)$ is the size of the smallest window such that, whatever
the position of the window on $u$, all factors of length $n$ that occur in $u$ occur
at least once inside the window. We shall write $R(n) = R_u(n)$ when there is
no ambiguity on the relevant infinite word (this convention also applies to other 
notations with an infinite word as an index). 
For instance, in the Thue-Morse word

0110100110010110100101100110100110010110011010010110100110010110 . . . 

(the fixed point of the substitution $\theta$ with $\theta(0) = 01$ and $\theta(1) = 10$), we have
$R(0) = 0$, $R(1) = 3$ (every factor of length 3 contains at least one 0 and one 1),
$R(2) = 9$ (the factor 01011010 of length 8 does not contain 00), $R(3) = 11$, etc.
The recurrence function of this word is studied in detail in [13]. We shall present
in Sect. 5 a method to compute $R(n)$ in general for words similar to this one.

An infinite word $u$ is uniformly recurrent if and only if $R_u$ takes only finite
values.
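The small values quoted for the Thue-Morse word can be reproduced numerically. The sketch below is only an estimate computed on a finite prefix: it relies on the identity $R(n) = \ell(n) + n - 1$ of Proposition 2 (Sect. 3.3), and it assumes that the prefix is long enough to exhibit the largest gap between consecutive occurrences of every factor of length $n$. The function names are ours.

```python
def thue_morse_prefix(length):
    """First `length` letters of the Thue-Morse word (parity of the binary digit sum)."""
    return "".join(str(bin(i).count("1") % 2) for i in range(length))


def max_return_time(word, n):
    """Largest gap between consecutive occurrences of any length-n factor of `word`."""
    last_seen = {}
    largest = 0
    for i in range(len(word) - n + 1):
        factor = word[i:i + n]
        if factor in last_seen:
            largest = max(largest, i - last_seen[factor])
        last_seen[factor] = i
    return largest


def recurrence_estimate(word, n):
    """Estimate R(n) as l(n) + n - 1, using only the finite prefix `word`."""
    return max_return_time(word, n) + n - 1


t = thue_morse_prefix(1 << 12)
print([recurrence_estimate(t, n) for n in range(4)])
```

On a prefix the estimate can only be smaller than or equal to the true value, so it should be read as a lower bound that becomes exact once the prefix is long enough.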

3.3 Return Times and Return Words

A closely related notion is that of return time. Given a factor $w$ of a recurrent
infinite word $u = u_0 u_1 u_2 \ldots$, an integer $i$ is the position of an occurrence of $w$
in $u$ if $u_i u_{i+1} \ldots u_{i+|w|-1} = w$. Let us denote by $i_0$ the smallest such position,
by $i_1$ the next one, etc., so that $(i_0, i_1, i_2, \ldots)$ is the increasing sequence of all
positions of occurrences of $w$ in $u$. Then define the set of words

$$r_u(w) = \{u_{i_j} u_{i_j+1} \ldots u_{i_{j+1}-1} : j \in \mathbb{N}\}.$$

Elements of $r_u(w)$ are called return words of $w$ in $u$, and the (possibly infinite)
number $\ell_u(w) = \sup\{|v| : v \in r_u(w)\}$ is called the (maximal) return time of $w$ in
$u$. Note that return words of $w$ either have $w$ as a prefix, or are prefixes of $w$.
The latter case happens when two occurrences of $w$ in $u$ overlap.

Finally, for all $n \in \mathbb{N}$ define $\ell_u(n) = \max\{\ell_u(w) : w \in F_n(u)\}$. Then

Proposition 2. For any recurrent infinite word $u \in A^{\mathbb{N}}$ and for any $n \in \mathbb{N}$,
one has $R_u(n) = \ell_u(n) + n - 1$ (with the convention that $+\infty + n - 1 = +\infty$).

For instance, consider the Fibonacci word 

0100101001001010010100100101001001010010100100101001010010010100 . . . 

(the fixed point of the substitution $0 \mapsto 01$, $1 \mapsto 0$). The factor 0010 occurs
at positions 2, 7, 10, 15, 20, 23, etc., and two return words can be observed in
the prefix of $u$ shown here, 001 and 00101. In fact, $r(0010) = \{001, 00101\}$ and
$\ell(0010) = 5$, but other factors of the same length have longer return times and
$\ell(4) = \ell(0101) = 8$, hence $R(4) = 11$.
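The positions and return words in this example can be recomputed from a prefix of the Fibonacci word. The following Python sketch uses helper names of our own choosing (`fibonacci_prefix`, `occurrences`, `return_words`) and only sees the return words that appear inside the chosen prefix.

```python
def fibonacci_prefix(length):
    """Prefix of the fixed point of the substitution 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if letter == "0" else "0" for letter in word)
    return word[:length]


def occurrences(word, factor):
    """Start positions of all (possibly overlapping) occurrences of `factor`."""
    return [
        i for i in range(len(word) - len(factor) + 1)
        if word.startswith(factor, i)
    ]


def return_words(word, factor):
    """Words read between consecutive occurrences of `factor` in `word`."""
    positions = occurrences(word, factor)
    return {word[a:b] for a, b in zip(positions, positions[1:])}


f = fibonacci_prefix(2000)
print(occurrences(f, "0010")[:6])
print(sorted(return_words(f, "0010")))
print(max(len(v) for v in return_words(f, "0101")))
```

The last line gives the return time of 0101 observed in the prefix; by Proposition 2, adding $n - 1 = 3$ to the largest return time among factors of length 4 yields $R(4)$.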

Return times have a direct dynamic interpretation. In the subshift $X =
\overline{\mathcal{O}(u)}$ generated by $u$, given a finite word $w \in F(u)$, the set $[w] = \{v \in X : w$
is a prefix of $v\}$ is both open and closed and is called a cylinder. Then
the definition of $\ell_u(w)$ can be rephrased as

$$\ell_u(w) = \inf\{N \in \mathbb{N} : \forall v \in [w],\ \exists n \in \mathbb{N},\ 1 \leq n \leq N \text{ and } T^n(v) \in [w]\},$$

i.e., $\ell_u(w)$ is the maximum time before which the system returns to the cylinder $[w]$.
The set $r_u(w)$ of return words of a factor $w$ is always a circular code, and if
$i_0$ is the position of the first occurrence of $w$ in $u$, then $T^{i_0}(u)$ can be factored
over this code. In particular, if $w$ is a prefix of $u$ and $r_u(w)$ is finite, then $u$ can
be recoded as $u = f(v)$, where $v \in B^{\mathbb{N}}$ is an infinite word on a new alphabet $B$,
and $f$ is a one-to-one map from $B$ to $r_u(w)$, extended as a substitution. Such a
word $v = \Delta_w(u)$ is said to be derivated from $u$. The following characterization
is due to F. Durand (a substitution $f : A^* \to A^*$ is primitive if there exists an
integer $n \geq 1$ such that, for all $a \in A$, $f^n(a)$ contains every letter of $A$ at least
once):

Theorem 1 (Durand [9]). An infinite word $u \in A^{\mathbb{N}}$ is a fixed point of a
primitive substitution on a subset of A if and only if u is uniformly recurrent 
and the number of distinct (up to letter renaming) infinite words derivated from 
u is finite. 

For instance, the Thue-Morse word, which is a fixed point of a primitive 
substitution, has three distinct derivated words, the Thue-Morse word t itself: 

0110100110010110100101100110100110010110011010010110100110010110 . . . 

the derivated word associated with the prefix 0, $\Delta_0(t)$, with $t = f_1(\Delta_0(t))$ where
$f_1(0) = 011$, $f_1(1) = 01$, $f_1(2) = 0$:

0120210121020120210201210120210121020121012021020120210121020120 . . . 

and the derivated word associated with all prefixes of length 2 or more, $v =
\Delta_{01}(t) = \Delta_{011}(t) = \cdots$, with $t = f_2(v) = f_3(v) = \cdots$, where $f_2(0) = 011$,
$f_2(1) = 010$, $f_2(2) = 0110$, $f_2(3) = 01$, and $f_{2+k} = \theta^k \circ f_2$ for all $k \in \mathbb{N}$:

0123013201232013012301320130123201230132012320130123201230132013 . . . 
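The recoding by return words described above can be sketched as follows. This is an illustration under our own conventions: the function `derived_word` is a hypothetical helper, it works on a finite prefix only, and it numbers the return words in order of first appearance, which happens to agree with the numbering used in the example.

```python
def thue_morse_prefix(length):
    """First `length` letters of the Thue-Morse word (parity of the binary digit sum)."""
    return "".join(str(bin(i).count("1") % 2) for i in range(length))


def derived_word(word, prefix):
    """Recode `word` over the return words of `prefix`, numbered by first appearance.

    Returns the recoded word (as a string of digits) and the coding table.
    """
    positions = [
        i for i in range(len(word) - len(prefix) + 1)
        if word.startswith(prefix, i)
    ]
    codes = {}
    letters = []
    for a, b in zip(positions, positions[1:]):
        block = word[a:b]          # return word between two occurrences
        if block not in codes:
            codes[block] = len(codes)
        letters.append(str(codes[block]))
    return "".join(letters), codes


t = thue_morse_prefix(256)
d0, table0 = derived_word(t, "0")
d01, table01 = derived_word(t, "01")
print(d0[:16], table0)
print(d01[:16], table01)
```

The coding tables play the role of the maps $f_1$ and $f_2$ read backwards: each return word is sent to the letter that encodes it.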

3.4 Recurrence and Subword Complexity 

Another numerical function associated with an infinite word $u$ is the (subword)
complexity function $p_u : \mathbb{N} \to \mathbb{N}$ defined by $p_u(n) = \#F_n(u)$, the number of
factors of length $n$ in $u$. There is no direct relation between the functions $p_u$ and
$R_u$, but only an inequality.

Proposition 3 (Morse and Hedlund [13]). For any infinite word $u \in A^{\mathbb{N}}$
and for any $n \in \mathbb{N}$, one has $\ell_u(n) \geq p_u(n)$ and $R_u(n) \geq p_u(n) + n - 1$.

For non-periodic words, this inequality is not optimal and Morse and Hedlund
show that it can be improved to $R_u(n) \geq p_u(n) + n$.

In the other direction, no such inequality holds since it is possible to construct
infinite words with $p(n) = n + 1$ (Sturmian words) for which $R(n)$ grows as fast
as desired, while remaining finite.
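As an illustration, the complexity function can be estimated by counting the distinct factors of a long prefix. The sketch below assumes that the prefix is long enough to contain every factor of the lengths considered; the helper names are ours.

```python
def fibonacci_prefix(length):
    """Prefix of the fixed point of the substitution 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if letter == "0" else "0" for letter in word)
    return word[:length]


def thue_morse_prefix(length):
    """First `length` letters of the Thue-Morse word (parity of the binary digit sum)."""
    return "".join(str(bin(i).count("1") % 2) for i in range(length))


def complexity(word, n):
    """Number of distinct factors of length n that occur in `word`."""
    return len({word[i:i + n] for i in range(len(word) - n + 1)})


f = fibonacci_prefix(2000)
t = thue_morse_prefix(2048)
print([complexity(f, n) for n in range(8)])   # Sturmian: n + 1 factors of length n
print([complexity(t, n) for n in range(1, 5)])
```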
4 Linear Recurrence

4.1 Linearly Recurrent Words 

When the recurrence function grows slowly, it means that all factors have to
occur rather often and this gives much structure to the infinite word. Of particular
interest are words for which $R(n)/n$ is bounded. An infinite word is said to be
linearly recurrent with constant $K$ if $\ell(n) \leq Kn$ for all $n \geq 1$ [11].

Proposition 4 (Durand, Host, and Skau [11]). Let $u \in A^{\mathbb{N}}$ be a linearly
recurrent infinite word with constant $K$. Then

(i) For all $n \geq 1$, $R(n) \leq (K + 1)n - 1$.

(ii) For all $n \geq 1$, $p(n) \leq Kn$.

(iii) $u$ is $(K + 1)$-power free (i.e., it does not contain any factor of the form
$w^{K+1}$ with $w \in A^* \setminus \{\varepsilon\}$).

(iv) For all $w \in F(u)$ and $v \in r_u(w)$, $|w|/K < |v| \leq K|w|$.

(v) For all $w \in F(u)$, $\#r_u(w) \leq K(K + 1)^2$.

Property (iv) shows that in a linearly recurrent word, return times can be
neither too long nor too short. By property (ii), linearly recurrent words are a
special case of words with linear subword complexity. In particular, this implies
that $p_u(n + 1) - p_u(n)$ is bounded by a constant that depends only on $K$ and
$\#A$ [3].

The structure of linearly recurrent words can be characterized using primi- 
tive S-adic infinite words, words obtained by applying in an appropriate order 
substitutions taken from a finite set (see [10] for a precise definition): 

Theorem 2 (Durand [10]). An infinite word u is linearly recurrent if and 
only if it is an element of the subshift generated by some primitive S-adic infinite 
word. 

4.2 Recurrence Quotient 

Another way to define linearly recurrent words is to use the recurrence quotient
$\rho_u$. For any infinite word $u$, let

$$\rho_u = \limsup_{n \to +\infty} \frac{R_u(n)}{n} \in \mathbb{R} \cup \{+\infty\}.$$

Then $\rho_u$ is a finite real number if and only if $u$ is linearly recurrent. Moreover,
if $u$ is linearly recurrent with constant $K$, then $\rho_u \leq K + 1$.

If $u$ is a purely periodic word, it is clear that $\rho_u = 1$. For non-periodic words,
Hedlund and Morse [13] asked as an open problem to find the best lower bound
for $\rho_u$. Proposition 3 together with the fact that $p_u(n) \geq n + 1$ ([13]) implies
that $\rho_u \geq 2$. Using graph representations, we improve this result to

Theorem 3. Let $u \in A^{\mathbb{N}}$ be an infinite word which is not purely periodic. Then
$\rho_u \geq 3$.
Rauzy [16] conjectured that the minimal value of $\rho_u$ for non-periodic words
is still larger:

Conjecture 1 (Rauzy [16]). Let $u \in A^{\mathbb{N}}$ be an infinite word which is not purely
periodic. Then $\rho_u \geq \frac{5 + \sqrt{5}}{2} \simeq 3.618$.

This value $(5 + \sqrt{5})/2$ is exactly the recurrence quotient of the Fibonacci word
(see below), so if the conjecture holds then it is optimal. We believe that the 
techniques used to prove Theorem 3 (see Sect. 7), and in particular the extensive 
study of possible Rauzy graphs, will lead to a proof of this conjecture. 

5 Computing the Recurrence Function 

5.1 Singular Factors 

Wen and Wen [18] defined singular words as particular factors of the Fibonacci 
word (the factors 0, 1, 00, 101, 00100, 10100101, etc., of length the successive 
Fibonacci numbers, that when concatenated in this order yield the infinite word 
itself). Here we define singular factors for any infinite word, generalizing one of 
the properties of Wen and Wen’s singular words. 

Let $u$ be an infinite word. A factor $w$ of $u$ is said to be singular for $u$ if
$|w| = 1$ or if there exist a word $v \in A^*$ and letters $x, x', y, y' \in A$ such that
$w = xvy$, $x \neq x'$, $y \neq y'$ and $\{xvy, x'vy, xvy'\} \subseteq F(u)$. In other words, a factor
$w$ is singular if there is a way to alter its first letter and still have a factor of $u$,
and symmetrically with the last letter.

When w = xvy is singular, then v is bispecial, i.e., v can be extended in at 
least two different ways both to the right and to the left (see [4]). 

Proposition 5 ([7]). Let $u$ be an infinite word and $n \geq 1$. If $\ell(n - 1) < \ell(n)$,
then there exists a singular factor $w$ of $u$ such that $\ell(n) = \ell(w)$.

A singular factor $w$ is said to be an essential singular factor if $\ell(w) = \ell(|w|) >
\ell(|w| - 1)$. We denote by $S(u)$ the set of singular factors of $u$, and by $S'(u)$ the
set of essential singular factors of $u$.

Theorem 4 ([7]). Let $u$ be an infinite word and $n \geq 1$. Then

$$\ell(n) = \sup\{\ell(w) : w \in S(u) \text{ and } |w| \leq n\} = \sup\{\ell(w) : w \in S'(u) \text{ and } |w| \leq n\}.$$

5.2 Computation Method 

Theorem 4 makes it possible to compute the recurrence function $R(n)$ explicitly as long
as one is able to describe singular factors (or at least essential singular factors) 
and their return time. Since singular factors are extensions of bispecial factors, 
techniques presented in [4] can be used to describe them when the infinite word is 
a fixed point of a substitution, or more generally when it is defined using a finite 
number of substitutions (S-adic words). This results in the following procedure: 
1. Determine bispecial factors. Usually a small number of bispecial factors of
small length generate all other bispecial factors through recurrence relations.

2. Deduce the form of singular words, and compute their length.

3. For a given singular word, determine the associated return words and compute
their length. Singular words with shorter return time can be left out
since they are not essential.

4. Deduce the function $\ell(n)$, which will typically be staircase-like, the position
and height of each step being expressed with a (usually linear) recurrence
relation.

As an example, let us apply this procedure to the Thue-Morse word. 

1. Apart from the empty word and letters, there are four families of bispecial
factors: 01, 10, 010 and 101 are bispecial, and if $w$ is bispecial then $\theta(w)$ is
also bispecial.

2. Bispecial factors in the families generated by 01 and 10 each give rise to four
singular factors, which can be summarized as $x\theta^k(y)z$ with $x, y, z \in \{0, 1\}$
and $k \geq 1$, of length $2^k + 2$. The two other families do not produce singular
factors (because they are weak bispecial factors), and the remaining singular
factors are all words of length 1 and 2, as well as 010 and 101.

3. Observation yields

$r(0) = \{0, 01, 011\}$,

$r(00) = \{0011, 001101, 001011, 00101101\}$,
$r(01) = \{01, 010, 011, 0110\}$,
$r(010) = \{010, 01011, 0100110, 010110011\}$,

the case of 1, 11, 10, and 101 being symmetric. The word 0010 always occurs
in the form $0\theta(00)1^{-1}$, hence

$r(0010) = \{0\theta(v)0^{-1} : v \in r(00)\}$

$= \{00101101, 001011010011, 001011001101, 0010110011010011\}$.

Similarly,

$r(0011) = \{1^{-1}\theta(v)1 : v \in r(101)\}$,
$r(1010) = \{\theta(v) : v \in r(11)\}$, and
$r(1011) = \{0^{-1}\theta(v)0 : v \in r(00)\}$.

Then the return words of $x\theta^{k+1}(y)z$ can be deduced from those of $x\theta^k(y)z$ by
applying $\theta$ and conjugating by $x$. One has $\ell(0010) = \ell(1010) = \ell(1011) = 16$
and $\ell(0011) = 18$, so obviously only 0011 and the family it generates are
essential. Essential singular factors of length 4 or more therefore have length
$2^k + 2$ and return time $9 \cdot 2^k$ (actually, this also holds for $k = 0$).

4. The function $\ell(n)$ is defined by $\ell(0) = 1$, $\ell(1) = 3$, $\ell(2) = 8$, and $\ell(n) = 9 \cdot 2^k$
for $2^k + 2 \leq n \leq 2^{k+1} + 1$. Consequently $\rho = 1 + \limsup \ell(n)/n = 10$.
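The staircase shape of $\ell(n)$ obtained in step 4 can be compared with return times measured on a prefix of the Thue-Morse word. This is a numerical sketch, not a proof: the function `staircase` encodes the closed form stated above, `max_return_time` measures gaps inside a finite prefix only, and both names are ours.

```python
def thue_morse_prefix(length):
    """First `length` letters of the Thue-Morse word (parity of the binary digit sum)."""
    return "".join(str(bin(i).count("1") % 2) for i in range(length))


def max_return_time(word, n):
    """Largest gap between consecutive occurrences of any length-n factor of `word`."""
    last_seen = {}
    largest = 0
    for i in range(len(word) - n + 1):
        factor = word[i:i + n]
        if factor in last_seen:
            largest = max(largest, i - last_seen[factor])
        last_seen[factor] = i
    return largest


def staircase(n):
    """Closed form of step 4: l(n) = 9 * 2**k when 2**k + 2 <= n <= 2**(k+1) + 1."""
    if n < 3:
        return (1, 3, 8)[n]
    k = (n - 2).bit_length() - 1   # largest k with 2**k <= n - 2
    return 9 * 2 ** k


t = thue_morse_prefix(1 << 14)
for n in range(18):
    print(n, max_return_time(t, n), staircase(n))

# The ratio R(n)/n peaks at n = 2**k + 2 and approaches 10 from below.
print(max((staircase(n) + n - 1) / n for n in range(3, 5000)))
```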
6 The Recurrence Quotient of Sturmian Words

6.1 Computing $\rho$ Using Continued Fractions

Sturmian words are infinite words for which p(n) = n + 1; see [12] for equivalent 
definitions, properties and references. They are all uniformly recurrent. 

In this particular case, the method given in the previous section to compute 
the recurrence function amounts to the method described by Morse and Hedlund 
[14] using continued fraction expansions. As far as the recurrence function is 
concerned, it is sufficient to study standard Sturmian words: given an irrational 
number $\alpha \in [0, 1] \setminus \mathbb{Q}$, the standard Sturmian word of density $\alpha$ is the word
$u = u_0 u_1 u_2 \ldots$ where $u_n = 1$ if the fractional part of $(n + 2)\alpha$ is less than $\alpha$,
and $u_n = 0$ otherwise.
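This definition translates directly into code. The sketch below uses floating-point arithmetic, which is an assumption that is only reasonable for short prefixes, and compares the result with the Fibonacci word for the density $(3 - \sqrt{5})/2$ that the text associates with it in the example following Proposition 6. The function names are ours.

```python
import math


def standard_sturmian_prefix(alpha, length):
    """u_n = 1 when frac((n + 2) * alpha) < alpha, else 0 (floating-point sketch)."""
    letters = []
    for n in range(length):
        fractional_part = ((n + 2) * alpha) % 1.0
        letters.append("1" if fractional_part < alpha else "0")
    return "".join(letters)


def fibonacci_prefix(length):
    """Prefix of the fixed point of the substitution 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if letter == "0" else "0" for letter in word)
    return word[:length]


alpha = (3 - math.sqrt(5)) / 2
u = standard_sturmian_prefix(alpha, 300)
print(u[:32])
print(u == fibonacci_prefix(300))
```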

Proposition 6 (Morse and Hedlund [14]). The essential singular factors of
the standard Sturmian word $u$ of density $\alpha \in [0, 1] \setminus \mathbb{Q}$ constitute a sequence $(w_i)$
with $|w_i| = q_i$ and $\ell(w_i) = q_i + q_{i+1}$, where $p_i/q_i$ are the convergents associated
with the continued fraction expansion of the density, $\alpha = [0; a_1, a_2, a_3, \ldots]$. The
recurrence quotient of $u$ is $\rho = 2 + \limsup [a_i; a_{i-1}, \ldots, a_1]$.

For instance, the Fibonacci word is the standard Sturmian word of density
$\alpha = (3 - \sqrt{5})/2 = [0; 2, 1, 1, \ldots]$. Its recurrence quotient is therefore $\rho = 2 +
\limsup [1; 1, 1, 1, \ldots, 1] = [3; 1, 1, 1, \ldots] = (5 + \sqrt{5})/2$. The denominators of the
convergents are the classical Fibonacci numbers, $q_0 = F_1 = 1$, $q_1 = F_2 = 2$,
$q_2 = F_3 = 3$, $q_3 = F_4 = 5$, $q_4 = F_5 = 8$, etc., and they correspond to the lengths
of the essential singular factors, $w_0 = 1$, $w_1 = 00$, $w_2 = 101$, $w_3 = 00100$,
$w_4 = 10100101$, etc. The associated return times are $\ell(w_i) = q_i + q_{i+1} = F_{i+3}$.
Finally, the recurrence function satisfies $R(n) = F_{i+2} + n - 1$ if $F_i \leq n < F_{i+1}$.

A consequence of this proposition is that a Sturmian word is linearly recurrent
if and only if the continued fraction expansion of its density is bounded. Then,
if $a = \limsup a_i$, one has $a + 2 < \rho < a + 3$.
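The closed form for the recurrence function of the Fibonacci word can be compared with values measured on a prefix. The sketch below reads the formula as $R(n) = F_{i+2} + n - 1$ for $F_i \leq n < F_{i+1}$ with $F_1 = 1$, $F_2 = 2$, and estimates $R(n)$ through Proposition 2; it assumes that the prefix is long enough for the lengths tested, and the function names are ours.

```python
def fibonacci_prefix(length):
    """Prefix of the fixed point of the substitution 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if letter == "0" else "0" for letter in word)
    return word[:length]


def max_return_time(word, n):
    """Largest gap between consecutive occurrences of any length-n factor of `word`."""
    last_seen = {}
    largest = 0
    for i in range(len(word) - n + 1):
        factor = word[i:i + n]
        if factor in last_seen:
            largest = max(largest, i - last_seen[factor])
        last_seen[factor] = i
    return largest


def fibonacci_recurrence(n):
    """R(n) = F_{i+2} + n - 1 where F_i <= n < F_{i+1}, F_1 = 1, F_2 = 2 (n >= 1)."""
    a, b = 1, 2                    # a = F_i, b = F_{i+1}
    while not (a <= n < b):
        a, b = b, a + b
    return (a + b) + n - 1         # F_{i+2} = F_i + F_{i+1}


f = fibonacci_prefix(5000)
for n in range(1, 14):
    print(n, max_return_time(f, n) + n - 1, fibonacci_recurrence(n))

# The ratio R(n)/n at n = F_i tends to (5 + sqrt(5)) / 2.
print(fibonacci_recurrence(987) / 987, (5 + 5 ** 0.5) / 2)
```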

6.2 The Spectrum of Values of $\rho$

Let $S \subseteq \mathbb{R} \cup \{+\infty\}$ be the set of values taken by $\rho$ for Sturmian words. The
set $S$ has an interesting topological structure (we treat sequences of integers
$b = (b_i)_{i \in \mathbb{N}}$ as infinite words on the infinite alphabet $\mathbb{N}^*$, so that the notation
$[b]$ means $[b_0; b_1, b_2, b_3, \ldots]$ and $[T^k(b)] = [b_k; b_{k+1}, b_{k+2}, b_{k+3}, \ldots]$):

Theorem 5 ([7]). The set $S$ is given by

$$S = \{2 + [b] : b \in (\mathbb{N}^*)^{\mathbb{N}} \text{ and } \forall k \in \mathbb{N},\ [b] \geq [T^k(b)]\} \cup \{+\infty\}.$$

It is a compact subset of $[0, +\infty]$, with empty interior. It has the power of the
continuum. Its smallest accumulation point is the transcendental number $\rho_0 =
2 + [v] \simeq 4.58565$, where $v = v_0 v_1 v_2 \ldots \in \{1, 2\}^{\mathbb{N}}$ is the fixed point of the
substitution $1 \mapsto 2$, $2 \mapsto 211$. The intersection of $S$ with the set of quadratic
numbers is dense in $S$. Every non-countable interval of $S$ contains a sub-interval
which is isomorphic to $S$ as an ordered set.
The transcendence of $\rho_0$ was proved by Allouche et al. in [2]. Some questions
remain open about the structure of $S$, for instance its Hausdorff dimension.
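The numerical value of $\rho_0$ can be approximated by truncating the continued fraction $[v]$. The following sketch assumes that 40 partial quotients are more than enough for five decimal places (continued fractions converge quickly); the helper names are ours and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction


def fixed_point_prefix(length):
    """Prefix of the fixed point of 1 -> 2, 2 -> 211 that starts with the letter 2."""
    word = [2]
    while len(word) < length:
        word = [
            letter
            for a in word
            for letter in ((2,) if a == 1 else (2, 1, 1))
        ]
    return word[:length]


def continued_fraction_value(partial_quotients):
    """Exact value of the finite continued fraction [b0; b1, ..., bm]."""
    value = Fraction(partial_quotients[-1])
    for b in reversed(partial_quotients[:-1]):
        value = b + 1 / value
    return value


v = fixed_point_prefix(40)
rho_0 = 2 + continued_fraction_value(v)
print(v[:12])
print(float(rho_0))
```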



7 Main Ideas for the Proof of Theorem 3 

Assume that $u \in A^{\mathbb{N}}$ is a non-periodic infinite word with $\rho < 3$. Assume also
that $u$ is a binary word (i.e., $\#A = 2$), since the general case can easily be
reduced to the binary case by projection.

Let $s(n) = p(n + 1) - p(n)$: since $u$ is not eventually periodic, $s(n) \geq 1$ for
all $n \in \mathbb{N}$. By Proposition 3, $\limsup p(n)/n < 2$, which implies that $s(n) = 1$
for infinitely many values of $n$. There are now two cases: either there is some
$n_0$ such that $s(n) = 1$ for all $n \geq n_0$, or there are infinitely many $n$ such that
$s(n) = 1$ and $s(n + 1) > 1$.

The first case is essentially the case of Sturmian words, and it is not difficult
to adapt the method of [14] to prove that $\rho \geq (5 + \sqrt{5})/2 > 3$ in this case.

In the second case, we have infinitely many $n$ for which the Rauzy graph (see
[16,4]) is “eight-shaped” and contains a strong bispecial factor. For subsequent
values of $n$, the Rauzy graphs get more complicated, and it is possible to express
return times of certain words as lengths of paths in these graphs, for which lower
bounds can be given. Combining these bounds, we get a contradiction with the
assumption $\rho < 3$.

To prove a lower bound for $\rho$ larger than 3, one would have to consider also
infinite words for which $\liminf s(n) = 2$. Rauzy graphs for these words can have
ten different shapes, which were first classified by Rote [17], and the study of
their evolutions would involve a large number of subcases.



8 Two Other Functions 

Two functions associated with an infinite word $u$ and similar to the recurrence
function have also been considered. The first one is $R'(n)$, the size of the smallest
prefix $w$ of $u$ such that $F_n(w) = F_n(u)$, defined by Allouche and Bousquet-Mélou
[1] to study a conjecture of Pomerance, Robson, and Shallit [15]. The second one
is $R''(n)$, the size of the smallest factor $w$ of $u$ such that $F_n(w) = F_n(u)$, studied
in [6]. These functions compare with each other as follows:

Proposition 7 ([6]). For any infinite word $u \in A^{\mathbb{N}}$ and any $n \in \mathbb{N}$, the
functions $p_u$, $R_u$, $R'_u$, and $R''_u$ satisfy the inequality
$p_u(n) + n - 1 \leq R''_u(n) \leq R'_u(n) \leq R_u(n)$.

It should be noted that, whereas the functions $R$ and $R''$ depend only on the
set of factors of $u$, or equivalently on the subshift generated by $u$, the function
$R'$ depends on the specific word $u$. The conjecture by Shallit et al., rephrased
using the function $R'$, was very similar to Conjecture 1: if $u$ is an infinite word
which is not eventually periodic, then $\limsup R'(n)/n \geq (3 + \sqrt{5})/2$. This is
indeed true for standard Sturmian words, but considering non-standard Sturmian
words (which have the same factors, but not the same prefixes) we were able to
construct a counter-example, and to prove its optimality:



Theorem 6 ([5]). Let $u \in A^{\mathbb{N}}$ be an infinite word that is not eventually
periodic. Then

$$\limsup_{n \to +\infty} \frac{R'_u(n)}{n} \geq \frac{29 - 2\sqrt{10}}{9} \simeq 2.51949$$


and this value is optimal since it is attained by the Sturmian word 



Z3 = 010010100100100101001001010010010010100100100101001001010010 . . . 



fixed point of the substitution $0 \mapsto 01001010$, $1 \mapsto 010$.

The function $R''(n)$ seems to have less interesting properties. It is not difficult
to see that the minimal value of $\limsup R''(n)/n$ for an infinite word that is not
eventually periodic is 2. This minimal value is attained, among others, by all
Sturmian words, which can in fact be characterized using $R''$:

Proposition 8 ([6]). An infinite word $u \in A^{\mathbb{N}}$ is Sturmian if and only if
$R''(n) = 2n$ for every $n \geq 0$.
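The functions $R'$ and $R''$ are easy to estimate by brute force on a prefix. The sketch below, with function names of our own, computes both for the Fibonacci word; it assumes that the prefix contains every factor of the lengths tested, so that the values for the prefix coincide with those for the infinite word.

```python
def fibonacci_prefix(length):
    """Prefix of the fixed point of the substitution 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if letter == "0" else "0" for letter in word)
    return word[:length]


def factor_set(word, n):
    """Set of factors of length n of the finite word `word`."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def shortest_prefix_cover(word, n):
    """R'(n) on a prefix: length of the shortest prefix with every length-n factor."""
    target_size = len(factor_set(word, n))
    seen = set()
    for i in range(len(word) - n + 1):
        seen.add(word[i:i + n])
        if len(seen) == target_size:
            return i + n
    return None


def shortest_factor_cover(word, n):
    """R''(n) on a prefix: length of the shortest factor with every length-n factor."""
    target = factor_set(word, n)
    size = len(target) + n - 1     # a shorter window cannot hold that many factors
    while size <= len(word):
        for start in range(len(word) - size + 1):
            if factor_set(word[start:start + size], n) == target:
                return size
        size += 1
    return None


f = fibonacci_prefix(400)
for n in range(9):
    print(n, shortest_factor_cover(f, n), shortest_prefix_cover(f, n))
```

For a Sturmian word the first column of values should equal $2n$, in agreement with Proposition 8, while the second column depends on the particular word and not only on its set of factors.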



9 Concluding Remarks 

Some of the properties that we have presented deal with the connections between
recurrence and other properties of infinite words: subword complexity,
frequencies, repetitions, special factors, etc. These connections have not been completely
explored: for instance, the inequality between $p(n)$ and $R(n)$ in Proposition 3
can certainly be improved. We mainly focused on linearly recurrent words, and
did not say much about recurrence of infinite words with very high subword
complexity: while complete words (i.e., infinite words with maximal complexity
$p(n) = (\#A)^n$) are not uniformly recurrent, how fast can the complexity of a
uniformly recurrent word grow?

Not much is known about the spectrum of values taken by $\rho$ for all infinite
words, not just Sturmian ones. A proof of Conjecture 1 would provide the
minimum of this spectrum, but there are many other questions on its structure.
Is it similar to that of $S$ in Theorem 5, or does it contain full intervals of real
numbers? It seems that at least the minimum is an isolated point; what is the
smallest accumulation point, and is it different from $\rho_0$?

Another problem from [13] is still open: is it true in general that $R(n)/n$ does
not converge to a limit? Some progress in this direction has been recently made
by N. Chekhova [8].
1 Introduction

Descriptive complexity theory provides a convenient and intuitive way to model a large 
variety of computational problems. The basic problem studied in descriptive complexity 
is of the following form: 

Given a finite relational structure $\mathcal{A}$ and a sentence $\varphi$ of some logic L, decide
if $\varphi$ is satisfied by $\mathcal{A}$.

We call this problem the model-checking problem for L. The model-checking problem 
and variants of it appear in different areas of computer science: 

The name model-checking is usually associated with automated verification. Here 
state spaces of finite state systems are modeled by Kripke structures and specifications 
are formulated in modal or temporal logics. Then checking whether the system has the 
specified property means checking whether the Kripke structure satisfies the specifica- 
tion, that is, model-checking. In recent years, this approach has very successfully been 
applied to the verification of circuits and protocols.

There has always been a close connection between descriptive complexity theory 
and database theory. As a matter of fact, some of the roots of descriptive complexity can 
be found in research on the expressiveness and complexity of database query languages 
(e.g. [3,31]). The basic link between the two areas is that relational databases and finite
relational structures are just the same. Therefore, the problem of evaluating a Boolean 
query of some query language L against a relational database is the model-checking 
problem for L. (A Boolean query is a query with a yes/no answer.) More generally, 
evaluating a $k$-ary query $\varphi(x_1, \ldots, x_k)$ in a structure (or database) $\mathcal{A}$ amounts to
computing all tuples $(a_1, \ldots, a_k)$ such that $\mathcal{A}$ satisfies $\varphi(a_1, \ldots, a_k)$. We call this variant
of the model-checking problem the evaluation problem for L. The most important logic
to be considered in this database context is first-order logic, which closely resembles
the commercial standard query language SQL.

A third application area for model-checking problems is artificial intelligence. Con- 
straint satisfaction problems can easily be formulated in terms of the model-checking 
problem for monadic second-order logic. This observation goes back to Feder and Vardi 
[11]. Research of the last few years has shown that there is an intimate connection be- 
tween constraint satisfaction problems and database theory [25,32,19]. 

Model-checking problems can also be used as a framework for reasoning about stan- 
dard problems considered in algorithms and complexity theory. To model, for example, 
the clique-problem by a model-checking problem, for every $k \geq 1$ we write a sentence
$\varphi^k_{\text{clique}}$ of first-order logic saying that a graph $G$ has a clique of size $k$. This shows that
the clique problem can be seen as a special case of the model-checking problem for 
first-order logic. Similarly, we can consider the graph coloring problem as a special 
case of the model-checking problem for monadic second-order logic. Thus algorithms 
for model-checking problems can also be seen as “meta-algorithms” for more concrete 
problems. While the algorithms for concrete problems obtained this way will usually 
not be the most efficient, they often highlight the structural reasons making the prob- 
lems tractable. Moreover, these meta-algorithms can be taken as a starting point for the 
development of more refined algorithms taking into account the special properties of 
the particular problem at hand. 
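To make the clique example concrete, the following Python sketch writes out one possible first-order sentence for a clique of size $k$ and evaluates it by the naive algorithm that tries every assignment of the quantified variables. Both the textual syntax of the sentence and the function names are our own choices; the sketch assumes a loop-free graph, so that the edge atoms already force the variables to take distinct values.

```python
from itertools import combinations, product


def clique_sentence(k):
    """Text of a first-order sentence stating that a graph has a clique of size k."""
    variables = [f"x{i}" for i in range(1, k + 1)]
    quantifiers = " ".join(f"∃{x}" for x in variables)
    atoms = [f"E({x}, {y})" for x, y in combinations(variables, 2)]
    body = " ∧ ".join(atoms) if atoms else "⊤"
    return f"{quantifiers} ({body})"


def has_clique(vertices, edges, k):
    """Naive model checking: try every assignment of x1, ..., xk to vertices."""
    symmetric = set(edges) | {(b, a) for a, b in edges}
    for assignment in product(vertices, repeat=k):
        if all((a, b) in symmetric for a, b in combinations(assignment, 2)):
            return True
    return False


# A triangle {1, 2, 3} with a pendant vertex 4.
vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(clique_sentence(3))
print(has_clique(vertices, edges, 3), has_clique(vertices, edges, 4))
```

The running time of this naive evaluation grows like $n^k$ for a graph with $n$ vertices, which is the kind of behaviour that more refined model-checking algorithms try to improve on restricted classes of structures.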

Often, we are not only interested in a model-checking problem itself, but also in 
certain variants. One example is the evaluation of database queries - model-checking in 
the strict sense only corresponds to the evaluation of Boolean queries, that is, queries 
with a yes/no answer, but queries whose output is a set of tuples of database entries 
also need to be evaluated. Constraint satisfaction problems provide another example - 
usually, we are not only interested in the question of whether a constraint satisfaction 
problem has a solution, but we actually want to construct a solution. Sometimes, we 
want to count the number of solutions, or generate a random solution, or construct a 
solution that is optimal in some sense. We refer to such problems as generalized model- 
checking problems. An abstract setting for studying the complexity of such problems is 
given in [21]. 

As the title suggests, we focus on first-order logic here. Examples of problems that 
can be described as generalized model-checking problems for first-order logic are given 
in Section 2.4, among them such well-known problems as Clique, Dominating Set,
Subgraph Isomorphism, Homomorphism, and Conjunctive Query Evaluation.
The main purpose of the paper is to give a survey of known results and explain
the basic techniques applied to prove them. One new result, which nicely illustrates 
the use of locality in model-checking algorithms, is concerned with first-order model- 
checking on graphs of low degree and on sparse random graphs. 



2 Generalized Model-Checking Problems

2.1 Structures and Queries 

A vocabulary is a finite set $\tau$ of relation symbols. Associated with every relation symbol
$R$ is a positive integer $\mathrm{ar}(R)$, the arity of $R$. A $\tau$-structure $\mathcal{A}$ consists of a set $A$ called
the universe of $\mathcal{A}$ and, for every $R \in \tau$, an $\mathrm{ar}(R)$-ary relation $R^{\mathcal{A}} \subseteq A^{\mathrm{ar}(R)}$. In this
paper, we only consider structures whose universe is finite. STR denotes the class of
all (finite) structures. If $\mathcal{C}$ is a class of structures, $\mathcal{C}[\tau]$ denotes the subclass of all
$\tau$-structures in $\mathcal{C}$.

For example, we can consider graphs as $\{E\}$-structures $\mathcal{G}$, where $E$ is a binary
relation symbol and $E^{\mathcal{G}}$ is symmetric and anti-reflexive.

Hypergraphs can be modeled as $\{V, I\}$-structures $\mathcal{H}$, where $V$ is unary and $I$ is
binary and $I^{\mathcal{H}} \subseteq V^{\mathcal{H}} \times (H \setminus V^{\mathcal{H}})$. (The hyperedges of $\mathcal{H}$ are the sets
$\{v \in V^{\mathcal{H}} \mid (v, e) \in I^{\mathcal{H}}\}$, for $e \in H \setminus V^{\mathcal{H}}$.)

Boolean circuits can be modeled as $\{E, A, N\}$-structures $\mathcal{B}$, where $E$ is binary,
$A, N$ are unary, and $(B, E^{\mathcal{B}})$ is a directed acyclic graph that has precisely one vertex
of out-degree 0 (the output node), $A^{\mathcal{B}}$ is a subset of all vertices with in-degree at least
2 (the and-nodes; all other vertices of in-degree at least 2 are considered as or-nodes),
and $N^{\mathcal{B}}$ is a subset of the nodes of in-degree 1 (the negation nodes).

2.2 First-Order Logic 

Atomic formulas, or atoms, are expressions of the form $x = y$ or $R x_1 \ldots x_r$, where $R$ is
an $r$-ary relation symbol and $x, y, x_1, \ldots, x_r$ are variables. The formulas of first-order
logic are built up in the usual way from the atomic formulas using the Boolean
connectives ($\wedge$, $\vee$, $\neg$, $\rightarrow$) and the quantifiers $\forall$, $\exists$.

The class of all first-order formulas is denoted by FO. The vocabulary of a first-order
formula $\varphi$, denoted by $\mathrm{voc}(\varphi)$, is the set of all relation symbols occurring in $\varphi$.
If $\Phi \subseteq \mathrm{FO}$ is a class of first-order formulas, then $\Phi[\tau]$ denotes the class of all $\varphi \in \Phi$
with $\mathrm{voc}(\varphi) \subseteq \tau$. A free variable of a first-order formula is a variable $x$ not in the
scope of a quantifier $\exists x$ or $\forall x$. The set of all free variables of a formula $\varphi$ is denoted by
$\mathrm{free}(\varphi)$. The notation $\varphi(x_1, \ldots, x_k)$ indicates that $\mathrm{free}(\varphi) = \{x_1, \ldots, x_k\}$. A sentence
is a formula without free variables.

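For a formula $\varphi(x_1, \ldots, x_k) \in \mathrm{FO}[\tau]$, a $\tau$-structure $\mathcal{A}$, and $a_1, \ldots, a_k \in A$, we
write $\mathcal{A} \models \varphi(a_1, \ldots, a_k)$ to say that $\mathcal{A}$ satisfies $\varphi$ if the variables $x_1, \ldots, x_k$ are
interpreted by the elements $a_1, \ldots, a_k$, respectively. We let

$$\varphi(\mathcal{A}) := \{(a_1, \ldots, a_k) \in A^k \mid \mathcal{A} \models \varphi(a_1, \ldots, a_k)\}.$$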

To extend this definition to sentences in a reasonable way, we identify the set consisting 
of the empty tuple with True and the empty set with False. 
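
The definition of $\varphi(\mathcal{A})$ can be made concrete with a small brute-force evaluator. The following Python sketch is only an illustration and not an algorithm from this paper: the encoding of formulas as nested tuples and the names `holds` and `evaluate` are assumptions made for the example. For a sentence (no free variables) it returns either the set containing the empty tuple (True) or the empty set (False), matching the convention above.

```python
from itertools import product

def holds(universe, relations, formula, env):
    """Return True if the structure satisfies `formula` under the assignment `env`."""
    op = formula[0]
    if op == "rel":                       # atom R x_1 ... x_r
        _, name, variables = formula
        return tuple(env[v] for v in variables) in relations[name]
    if op == "eq":                        # atom x = y
        return env[formula[1]] == env[formula[2]]
    if op == "not":
        return not holds(universe, relations, formula[1], env)
    if op == "and":
        return all(holds(universe, relations, f, env) for f in formula[1:])
    if op == "or":
        return any(holds(universe, relations, f, env) for f in formula[1:])
    if op in ("exists", "forall"):
        _, var, body = formula
        results = (holds(universe, relations, body, {**env, var: a}) for a in universe)
        return any(results) if op == "exists" else all(results)
    raise ValueError(f"unknown connective: {op}")

def evaluate(universe, relations, formula, free_vars):
    """Compute phi(A) as a set of tuples, one entry per free variable."""
    return {t for t in product(universe, repeat=len(free_vars))
            if holds(universe, relations, formula, dict(zip(free_vars, t)))}

# Path graph 1 - 2 - 3 with a symmetric edge relation E.
universe = [1, 2, 3]
relations = {"E": {(1, 2), (2, 1), (2, 3), (3, 2)}}
# phi(x) := forall y (x = y or E x y), "x is adjacent to every other vertex"
dominates = ("forall", "y", ("or", ("eq", "x", "y"), ("rel", "E", ("x", "y"))))
print(evaluate(universe, relations, dominates, ("x",)))           # {(2,)}
print(evaluate(universe, relations, ("exists", "x", dominates), ()))  # {()}
```

The running time of this sketch is exponential in the number of variables, which is in line with the $\|\mathcal{A}\|^{O(\|\varphi\|)}$ bound discussed in Section 2.5.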

2.3 Generalized Model-Checking Problems 

The basic model-checking problem asks whether a given structure satisfies a given
sentence. In our more general setting, we shall consider formulas with free variables. For
every class $\Phi \subseteq \mathrm{FO}$ of formulas and every class $\mathcal{C} \subseteq \mathrm{STR}$ of structures we consider the
following four basic problems:

The input is always a structure $\mathcal{A} \in \mathcal{C}$ and a formula $\varphi \in \Phi$.

The decision problem. Decide if $\varphi(\mathcal{A})$ is non-empty. Essentially, this problem is the
same as the model-checking problem. Therefore, we refer to this problem as
$\Phi$-Model-Checking on $\mathcal{C}$.

The construction problem. Find a tuple $\bar{a} \in \varphi(\mathcal{A})$ if such a tuple exists. We refer to this
problem as $\Phi$-Construction on $\mathcal{C}$.

The listing problem. Compute the set $\varphi(\mathcal{A})$. Because of its database application, we
refer to this problem as $\Phi$-Evaluation on $\mathcal{C}$.

The counting problem. Compute the size of $\varphi(\mathcal{A})$. We refer to this problem as
$\Phi$-Counting on $\mathcal{C}$.



If $\mathcal{C}$ is the class STR of all structures, we usually do not mention it explicitly and speak
of $\Phi$-Model-Checking, $\Phi$-Construction, et cetera.

Another interesting problem is the sampling problem, that is, the problem of generating
a random element of $\varphi(\mathcal{A})$. To model combinatorial optimization problems, one
may consider structures $\mathcal{A}$ on which a suitable cost-function is defined and then search
for optimal solutions in $\varphi(\mathcal{A})$. But in this paper, we focus on the four basic problems.

This uniform view on combinatorial problems associated with a binary relation that
relates instances (in our case pairs $(\mathcal{A}, \varphi) \in \mathcal{C} \times \Phi$) with solutions ($\bar{a} \in \varphi(\mathcal{A})$ for us) is
well-studied in complexity theory [17,29,23]. An abstract model-theoretic framework
for considering such problems as generalized model-checking problems is presented in
[21].



2.4 Examples 

Before we proceed, let us consider a few examples of problems that can be described 
as generalized model-checking problems for first-order logic. 

Example 1. Let $\varphi^k_{\text{clique}}(x_1, \ldots, x_k)$ be the formula $\bigwedge_{1 \le i < j \le k} E x_i x_j$, and let
$\Phi_{\text{clique}} := \{\varphi^k_{\text{clique}} \mid k \ge 1\}$. Observe that for every graph $\mathcal{G}$, $\varphi^k_{\text{clique}}(\mathcal{G})$ is the set of all $k$-cliques of
$\mathcal{G}$. Thus $\Phi_{\text{clique}}$-Model-Checking on the class of all graphs is just the well-known
Clique problem. $\Phi_{\text{clique}}$-Construction is the problem of finding a clique of given
size in a given graph. $\Phi_{\text{clique}}$-Evaluation is the problem of finding all cliques of
given size in a given graph, and $\Phi_{\text{clique}}$-Counting is the problem of counting the
number of such cliques.
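
To illustrate how the four basic problems relate for $\Phi_{\text{clique}}$, here is a minimal brute-force Python sketch. The function names are hypothetical and chosen for this example only; the code simply enumerates $\varphi^k_{\text{clique}}(\mathcal{G})$ and is not meant as an efficient algorithm. Note that $\varphi^k_{\text{clique}}(\mathcal{G})$ is a set of tuples, so each $k$-clique (as a vertex set) appears once per ordering of its vertices.

```python
from itertools import product

def clique_tuples(vertices, edges, k):
    """Yield every k-tuple (v_1, ..., v_k) with E v_i v_j for all i < j."""
    adjacent = set(edges) | {(b, a) for (a, b) in edges}  # E is symmetric
    for t in product(vertices, repeat=k):
        if all((t[i], t[j]) in adjacent for i in range(k) for j in range(i + 1, k)):
            yield t

def decide(vertices, edges, k):
    """Model-Checking: is phi(G) non-empty?"""
    return next(clique_tuples(vertices, edges, k), None) is not None

def construct(vertices, edges, k):
    """Construction: some tuple in phi(G), or None if there is none."""
    return next(clique_tuples(vertices, edges, k), None)

def evaluate(vertices, edges, k):
    """Evaluation: the whole set phi(G)."""
    return set(clique_tuples(vertices, edges, k))

def count(vertices, edges, k):
    """Counting: the size of phi(G)."""
    return sum(1 for _ in clique_tuples(vertices, edges, k))

# A triangle 1-2-3 with a pendant vertex 4 attached to 3.
vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(decide(vertices, edges, 3), construct(vertices, edges, 3), count(vertices, edges, 3))
```

On this example the single triangle contributes $3! = 6$ tuples to $\varphi^3_{\text{clique}}(\mathcal{G})$, and there is no 4-clique.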

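Two well-known generalizations of the Clique problem are the Graph Homomorphism
problem and the Subgraph Isomorphism problem:

Example 2. For every $\tau$-structure $\mathcal{A}$ with universe $A = \{a_1, \ldots, a_k\}$, let

$$\varphi^{\mathcal{A}}_{\text{hom}}(x_1, \ldots, x_k) := \bigwedge_{\substack{R \in \tau \\ r\text{-ary}}} \; \bigwedge_{(a_{i_1}, \ldots, a_{i_r}) \in R^{\mathcal{A}}} R x_{i_1} \ldots x_{i_r},$$

and let $\Phi_{\text{hom}} := \{\varphi^{\mathcal{A}}_{\text{hom}} \mid \mathcal{A} \in \mathrm{STR}\}$. Then for every $\tau$-structure $\mathcal{B}$, $\varphi^{\mathcal{A}}_{\text{hom}}(\mathcal{B})$ is the
set of homomorphic images of $\mathcal{A}$ in $\mathcal{B}$. Thus $\Phi_{\text{hom}}$-Model-Checking is the same as
Homomorphism (on relational structures).

Note that $\varphi^k_{\text{clique}} = \varphi^{\mathcal{K}_k}_{\text{hom}}$ for the complete graph $\mathcal{K}_k$ with $k$ vertices.

Adding clauses stating that the variables are pairwise distinct to the formulas in
$\Phi_{\text{hom}}$, we obtain a class $\Phi_{\text{sub}}$ of formulas whose Model-Checking problem is the
Substructure Isomorphism problem.

Another slight modification yields a description of Induced Substructure Isomorphism.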

Homomorphism is closely related to the following problem playing a fundamental 
role in database theory: 

Example 3. Conjunctive queries are relational database queries that can be described
by first-order formulas of the form $\exists y_1 \ldots \exists y_m (\alpha_1 \wedge \ldots \wedge \alpha_n)$, where $\alpha_1, \ldots, \alpha_n$ are
atomic formulas. Let CQ denote the class of all such formulas. Note that $\Phi_{\text{hom}} \subseteq \mathrm{CQ}$.

CQ-Evaluation is the problem of evaluating a conjunctive query against a finite
relational database.

From a logical point of view, the formulas considered so far are all very simple;
they are all quantifier-free or existential first-order formulas. The following examples
require more complicated formulas.

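Example 4. For every $k \ge 1$, let $\varphi^k_{\text{DS}}(x_1, \ldots, x_k) := \forall y \bigvee_{i=1}^{k} (x_i = y \vee E x_i y)$, and
let $\Phi_{\text{DS}} := \{\varphi^k_{\text{DS}} \mid k \ge 1\}$. Then for every graph $\mathcal{G}$, $\varphi^k_{\text{DS}}(\mathcal{G})$ is the set of all dominating
sets of $\mathcal{G}$ of size $k$. Thus $\Phi_{\text{DS}}$-Model-Checking on the class of all graphs is the
Dominating Set problem.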

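Example 5. For all $\tau$-structures $\mathcal{A} \subseteq \mathcal{A}^+$ with universes $A = \{a_1, \ldots, a_k\}$ and
$A^+ = A \cup \{a_{k+1}, \ldots, a_l\}$, let

$$\varphi^{\mathcal{A},\mathcal{A}^+}_{\text{ext}} := \forall x_1 \ldots \forall x_k \bigl(\varphi^{\mathcal{A}}_{\text{sub}}(x_1, \ldots, x_k) \rightarrow \exists x_{k+1} \ldots \exists x_l \, \varphi^{\mathcal{A}^+}_{\text{sub}}(x_1, \ldots, x_l)\bigr),$$

and let $\Phi_{\text{ext}} := \{\varphi^{\mathcal{A},\mathcal{A}^+}_{\text{ext}} \mid \mathcal{A} \subseteq \mathcal{A}^+ \in \mathrm{STR}\}$. Then for every structure $\mathcal{B}$ we have
$\varphi^{\mathcal{A},\mathcal{A}^+}_{\text{ext}}(\mathcal{B}) = \text{True}$ if, and only if, every substructure of $\mathcal{B}$ that is isomorphic to $\mathcal{A}$ can
be extended to a substructure isomorphic to $\mathcal{A}^+$.

$\Phi_{\text{ext}}$-Model-Checking is the Extension problem for relational structures.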



Example 6. Recall that Boolean circuits can be viewed as $\{E, A, N\}$-structures. The
depth of a Boolean circuit is the length of the longest path from an input node to the
output node. It is easy to find, for all $d, k \ge 1$, a first-order formula $\varphi^{d,k}_{\text{sat}}(x_1, \ldots, x_k)$
such that for every Boolean circuit $\mathcal{C}$, a tuple $(c_1, \ldots, c_k) \in C^k$ is contained in $\varphi^{d,k}_{\text{sat}}(\mathcal{C})$
if, and only if, the assignment that sets $c_1, \ldots, c_k$ to True and all other input nodes to
False satisfies $\mathcal{C}$. Note that the formula $\varphi^{d,k}_{\text{sat}}$ may have up to $d$ quantifier alternations.
Let $\Phi_{\text{sat}} := \{\varphi^{d,k}_{\text{sat}} \mid d, k \ge 1\}$. Then $\Phi_{\text{sat}}$-Model-Checking is essentially the
Circuit Satisfiability problem.

Example 7. Using the well-known equivalence of the relational calculus and first- 
order logic, we observe that FO-Evaluation is equivalent to the problem of evaluating 
relational calculus queries against finite relational databases. 

2.5 Complexity 

Our underlying model of computation is the standard RAM-model with addition and
subtraction as arithmetic operations (cf. [1,30]). In our complexity analysis we use the
uniform cost measure. Structures are coded in a straightforward way by first describing
their vocabulary, then listing all elements of the universe, listing all tuples of all
relations, and then listing the constants. For details, we refer the reader to [12]. The length
of the encoding of a structure $\mathcal{A}$ is denoted by $\|\mathcal{A}\|$. For a fixed vocabulary $\tau$ we have
$\|\mathcal{A}\| \in O\bigl(|A| + \sum_{R \in \tau} |R^{\mathcal{A}}|\bigr)$ for all $\mathcal{A} \in \mathrm{STR}[\tau]$. For instance, for graphs $\mathcal{G}$ with $n$
vertices and $m$ edges this means $\|\mathcal{G}\| \in O(n + m)$. We fix some reasonable way to
encode formulas; the length of the encoding of a formula $\varphi$ is denoted by $\|\varphi\|$.

The most straightforward way to measure the complexity of a model-checking problem
is to measure it just in the length of the input, that is, in $\|\mathcal{A}\| + \|\varphi\|$. This is usually
referred to as the combined complexity of the problem [31]. In particular, we say that
the combined complexity of $\Phi$-Model-Checking on $\mathcal{C}$ is in PTIME if there is a
polynomial $p(X)$ and an algorithm solving the problem in time at most $p(\|\mathcal{A}\| + \|\varphi\|)$. The
same definition applies to the other generalized model-checking problems.

However, since $\Phi$ may contain formulas with arbitrarily many free variables, in
general there is no polynomial bound on the size of $\varphi(\mathcal{A})$ (in terms of $\|\mathcal{A}\| + \|\varphi\|$).
There are different complexity theoretic notions to handle listing problems with
potentially large output (see, for example, [18,24]). The simplest is to measure the
complexity both in terms of the size of the input and the size of the output. We say that
the combined complexity of $\Phi$-Evaluation on $\mathcal{C}$ is in PTT (polynomial total time)
if there is a polynomial $p(X)$ and an algorithm solving the problem in time at most
$p(\|\mathcal{A}\| + \|\varphi\| + \|\varphi(\mathcal{A})\|)$. A stricter notion is that of polynomial delay. An $f(n)$-delay
algorithm for a listing problem is an algorithm that generates its first output in at most
$f(n)$ steps and thereafter never takes more than $f(n)$ steps between generating two
outputs. We say that $\Phi$-Evaluation on $\mathcal{C}$ is in PD if there is a polynomial $p(X)$ and a
$p(\|\mathcal{A}\| + \|\varphi\|)$-delay algorithm solving the problem.

Theorem 8 (Stockmeyer [28], Vardi [31]). The combined complexity of FO-Model-Checking
is PSPACE-complete.

Of course, under the assumption that PTIME $\ne$ PSPACE, this implies that the
combined complexity of none of the generalized model-checking problems for FO is in
PTIME, or PTT for the listing problem.

The proof of the hardness part of Theorem 8 is by a reduction from the Quantified
Boolean Formula problem; essentially, Quantified Boolean Formula is
the same as FO-model-checking on a structure with just two distinguishable elements
representing the Boolean values True and False.

Thus the high complexity of FO-Model-Checking is caused by large and complicated
input formulas. The input structure only plays a very limited role; it can actually
be a fixed two-element structure whose vocabulary only consists of one unary relation
symbol. However, instances of practical problems modeled by model-checking prob- 
lems usually consist of large input structures representing some “real-world” data such 
as a network or a relational database, and much smaller formulas representing a spec- 
ification of the information we want to gather from these data. Thus the relevance of 
Theorem 8 is somewhat limited. 

The simplest way to take into account that usually the input structure is much larger
than the input formula is to completely disregard the formula and measure the
complexity just in terms of the size of the input structure. This complexity measure is
known as the data complexity [31]. In particular, we say that the data complexity of
$\Phi$-Model-Checking on $\mathcal{C}$ is in PTIME if for every formula $\varphi \in \Phi$ there is a
polynomial $p_\varphi(X)$ and an algorithm that solves the model-checking problem in at most
$p_\varphi(\|\mathcal{A}\|)$ steps.* It is easy to see that the data complexity of FO-Model-Checking,
FO-Construction, FO-Evaluation, and FO-Counting is in PTIME. More precisely,
there is an $\|\mathcal{A}\|^{O(\|\varphi\|)}$-algorithm for each of the problems.

* Here we may consider both a uniform version where there is just one algorithm solving the
model-checking problem, or the non-uniform version where for every formula $\varphi$ there is an
algorithm solving $\{\varphi\}$-Model-Checking on $\mathcal{C}$ in time $p_\varphi(X)$. Usually, data complexity
refers to the non-uniform version.

However, this is still not completely satisfying because even for very small input
formulas $\varphi$, $\|\mathcal{A}\|^{\|\varphi\|}$ cannot be seen as a feasible complexity. On the other hand,
a complexity such as $2^{\|\varphi\|} \cdot \|\mathcal{A}\|$ is acceptable. Parameterized complexity is a refined
measure taking these considerations into account. It may often be the most appropriate
way to measure the complexity of generalized model-checking problems. We say
that $\Phi$-Model-Checking on $\mathcal{C}$ is fixed-parameter tractable, or equivalently that the
parameterized complexity of $\Phi$-Model-Checking on $\mathcal{C}$ is in FPT, if there is a
constant $c > 0$, a computable function $f$, and an algorithm solving the problem in time at
most $f(\|\varphi\|) \cdot \|\mathcal{A}\|^c$.** The parameterized complexity of the other generalized
model-checking problems can be defined analogously. Similarly as for the combined
complexity, we can usually not expect the parameterized complexity of $\Phi$-Evaluation to be in
FPT simply because the output may get too large. It is straightforward to define
parameterized analogues of the classes PTT and PD, which we refer to as FPTTT and FPTD.
For further background in parameterized complexity theory, we refer the reader to [9].

The following theorem can be seen as the parameterized analogue of Theorem 8. 
Under the complexity theoretic assumption that the parameterized complexity classes 
AW[*] and FPT are distinct, it implies that the generalized model-checking problems 
for FO are not fixed-parameter tractable. 

Theorem 9 (Downey, Fellows, Taylor [10]). FO-Model-Checking is complete for 
AW[*] under parameterized reductions. 

So we are facing the situation that both the combined complexity and the parame- 
terized complexity of the generalized model-checking problems for first-order logic are 
very high. In the next section, we shall study restrictions on the class of formulas that 
make the problems tractable, and in Section 4 we shall study restrictions on the class C 
of structures. 

Remark 10. It is interesting to see what the relation among the complexities of the
different generalized model-checking problems is. It can be shown [21] that under mild
closure conditions on the class $\Phi$ of formulas and the class $\mathcal{C}$ of structures, the following
four statements are equivalent: (i) The combined complexity of $\Phi$-Model-Checking
on $\mathcal{C}$ is in PTIME. (ii) The combined complexity of $\Phi$-Construction on $\mathcal{C}$ is in
PTIME. (iii) The combined complexity of $\Phi$-Evaluation on $\mathcal{C}$ is in PD. (iv) The
combined complexity of $\Phi$-Evaluation on $\mathcal{C}$ is in PTT.

An analogous statement holds with respect to parameterized complexity.

In general, $\Phi$-Counting on $\mathcal{C}$ is a harder problem, as a suitable formalization of
the Maximum Matching problem on bipartite graphs shows.

** As for the data complexity, there is also a non-uniform version of this definition, but for
parameterized complexity the uniform version is more common.

3 Simple Formulas

In this section we look for restrictions on the class $\Phi \subseteq \mathrm{FO}$ of formulas making a
generalized model-checking problem tractable. A first idea is to restrict quantifier alternations
and just look at existential formulas. Conjunctive queries (see Example 3) are a good
starting point because they are particularly simple existential formulas that are
nevertheless very important. As expected, model-checking problems for conjunctive queries
are of lower complexity than those for full first-order logic, but unfortunately they are
still not tractable:

Theorem 11 (Chandra and Merlin [4], Papadimitriou and Yannakakis [26]).
The combined complexity of CQ-Model-Checking is NP-complete, and the
parameterized complexity is W[1]-complete.

W[1] is a parameterized complexity class that plays a similar role in parameterized
complexity theory as NP does in classical complexity theory. In particular, it is believed
that FPT $\ne$ W[1].

To get tractable model-checking problems, we need to consider even simpler
formulas, and it is not clear how they might look. A fruitful idea is to study the graph
$G_\varphi$ of a conjunctive query $\varphi$: The vertex set of $G_\varphi$ is the set of all variables of $\varphi$, and there
is an edge between variables $x$ and $y$ if, and only if, there is an atom $\alpha$ of $\varphi$ that
contains both $x$ and $y$. The hope is that model-checking is easy for queries with a “simple”
graph, and indeed this is true for the right notion of “simplicity”, which turns out to be
“tree-likeness”, or more precisely, bounded tree-width.*** For a class $\mathcal{C}$ of graphs, we
let CQ($\mathcal{C}$) denote the class of all conjunctive queries $\varphi$ with $G_\varphi \in \mathcal{C}$.

Example 12. Let TREE denote the class of all trees. In this example, we consider the
class CQ(TREE) of all conjunctive queries whose underlying graph is a tree. We shall
prove the following:

CQ(TREE)-Model-Checking can be solved in time $O(\|\mathcal{A}\| \cdot \|\varphi\|)$.

Let $\varphi(x_1, \ldots, x_k) := \exists x_{k+1} \ldots \exists x_l (\alpha_1 \wedge \ldots \wedge \alpha_m) \in \mathrm{CQ(TREE)}[\tau]$, and let $\mathcal{A}$ be
a $\tau$-structure. Without loss of generality we assume that $\varphi$ contains no equalities. If
$x_i = x_j$ is an atom of $\varphi$, we can just delete that atom and replace $x_j$ by $x_i$ everywhere.
The resulting formula is equivalent to $\varphi$ and still in CQ(TREE).

The graph $G_\varphi$ is a tree with universe $T := \{x_1, \ldots, x_l\}$. We declare $x_1$ to be the root
of this tree. We define the parent and the children of a node $x_i$ in the usual way. Let $\mathcal{T}$
denote the directed tree with universe $T$ and edge relation $E^{\mathcal{T}} := \{xy \mid x \text{ parent of } y\}$.
We define the tree-order $\le^{\mathcal{T}}$ to be the reflexive transitive closure of $E^{\mathcal{T}}$.

For every node $x \in T$, we let $\delta_x$ be the conjunction of all atoms $\alpha_i$ of $\varphi$ with
$\mathrm{free}(\alpha_i) = \{x\}$. For every edge $xy \in E^{\mathcal{T}}$, we let $\varepsilon_{xy}$ be the conjunction of all atoms $\alpha_i$
of $\varphi$ with $\mathrm{free}(\alpha_i) = \{x, y\}$. Then every atom of $\varphi$ occurs in precisely one $\delta_x$ or $\varepsilon_{xy}$.
Thus $\varphi$ is equivalent to the formula $\exists x_{k+1} \ldots \exists x_l \bigl( \bigwedge_{x \in T} \delta_x \wedge \bigwedge_{xy \in E^{\mathcal{T}}} \varepsilon_{xy} \bigr)$.

Our algorithm is a straightforward dynamic programming algorithm. It starts by
doing some pre-computations setting up the data structures needed in the actual
dynamic programming phase. For every vertex $x \in T$ it computes $\delta_x(\mathcal{A})$ and stores it
in a Boolean array with one entry for every $a \in A$. This requires time $O(|A| \cdot \|\delta_x\|)$.
Similarly, for every edge $xy \in E^{\mathcal{T}}$ it computes the set $\varepsilon_{xy}(\mathcal{A})$. Then for every $a \in A$
it produces a linked list that contains all $b$ such that $ab \in \varepsilon_{xy}(\mathcal{A})$. This can be done in
time $O(\|\mathcal{A}\| \cdot \|\varepsilon_{xy}\|)$ (see [12] for details).

*** For graph theoretic notions such as tree-width or minors that are left unexplained here,
we refer the reader to P].







Martin Grohe 



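Thus the overall time required for these pre-computations is $O(\|\mathcal{A}\| \cdot \|\varphi\|)$.

Let $y_0 \in T$ and let $y_1, \ldots, y_p$ be all descendants of $y_0$ in $\mathcal{T}$, that is, all nodes
$x \in T \setminus \{y_0\}$ such that $y_0 \le^{\mathcal{T}} x$. The subtree-formula of $y_0$ is the formula

$$\sigma_{y_0} := \exists y_1 \ldots \exists y_p \Bigl( \bigwedge_{0 \le i \le p} \delta_{y_i} \;\wedge \bigwedge_{\substack{0 \le i, j \le p \\ y_i y_j \in E^{\mathcal{T}}}} \varepsilon_{y_i y_j} \Bigr).$$

Note that $\mathrm{free}(\sigma_{y_0}) = \{y_0\}$.

Now the dynamic programming phase starts. Inductively from the leaves to the
root of $\mathcal{T}$, for every $x \in T$ the algorithm computes the set $\sigma_x(\mathcal{A})$ and stores it in a
Boolean array. For the leaves $x$ of $\mathcal{T}$, we have $\sigma_x = \delta_x$. Since the algorithm has already
computed $\delta_x(\mathcal{A})$, there is nothing to do. For a node $x$ with children $y_1, \ldots, y_q$ we let
$S_0 := \delta_x(\mathcal{A})$ and, for $i \ge 1$, $S_i := \{a \in S_{i-1} \mid \exists b \in A : b \in \sigma_{y_i}(\mathcal{A}) \text{ and } ab \in \varepsilon_{xy_i}(\mathcal{A})\}$.
Using the arrays for $\delta_x(\mathcal{A})$ and $\sigma_{y_i}(\mathcal{A})$ and, for every $a \in A$, the list of
all $b$ such that $ab \in \varepsilon_{xy_i}(\mathcal{A})$, it is easy to see that $S_i$ can be computed from $S_{i-1}$ in time
$O(\|\mathcal{A}\|)$, and thus $S_q = \sigma_x(\mathcal{A})$ can be computed in time $O(q \cdot \|\mathcal{A}\|)$.

Hence the overall time required in the dynamic programming phase is
$O(\|\mathcal{A}\| \cdot |T|) \subseteq O(\|\mathcal{A}\| \cdot \|\varphi\|)$.

Remember that $x_1$ is the root of $\mathcal{T}$ and observe that $\sigma_{x_1}(\mathcal{A})$ is the set of all $a_1 \in A$
for which there exist $a_2, \ldots, a_l \in A$ such that $(a_1, \ldots, a_l) \in \bigl(\bigwedge_{i=1}^{m} \alpha_i\bigr)(\mathcal{A})$ and thus
$(a_1, \ldots, a_k) \in \varphi(\mathcal{A})$. This implies $\varphi(\mathcal{A}) \ne \emptyset$ if, and only if, $\sigma_{x_1}(\mathcal{A}) \ne \emptyset$.

Therefore, our algorithm returns True if $\sigma_{x_1}(\mathcal{A}) \ne \emptyset$ and False otherwise.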
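
The dynamic programming phase can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the implementation referred to in [12]: the pre-computation phase is assumed to be done already, so the sets $\delta_x(\mathcal{A})$ are passed in as `delta` and the sets $\varepsilon_{xy}(\mathcal{A})$ as `eps`, with pairs ordered as (value of parent $x$, value of child $y$). Python sets stand in for the Boolean arrays and linked lists of the text, and all names are hypothetical.

```python
def sigma(x, children, delta, eps, universe):
    """Return sigma_x(A): the values for x that extend to the whole subtree below x."""
    result = set(delta.get(x, universe))          # S_0 := delta_x(A); no atoms means all of A
    for y in children.get(x, ()):
        sigma_y = sigma(y, children, delta, eps, universe)
        # S_i := {a in S_{i-1} | some b in sigma_y(A) has ab in eps_xy(A)}
        supported = {a for (a, b) in eps[(x, y)] if b in sigma_y}
        result &= supported
    return result

def tree_query_holds(root, children, delta, eps, universe):
    """Decide whether phi(A) is non-empty, i.e. whether sigma_root(A) is non-empty."""
    return bool(sigma(root, children, delta, eps, universe))

# Query: exists x, y, z (E x y and E x z and R y and B z); the query graph is a
# tree with root x and children y, z.
universe = {1, 2, 3, 4}
E = {(1, 2), (1, 3), (4, 2)}
children = {"x": ["y", "z"]}
delta = {"y": {2}, "z": {3}}                      # R = {2}, B = {3}
eps = {("x", "y"): E, ("x", "z"): E}
print(sigma("x", children, delta, eps, universe))
print(tree_query_holds("x", children, delta, eps, universe))
```

Each tree edge is handled by one pass over $\varepsilon_{xy}(\mathcal{A})$, which mirrors the $O(\|\mathcal{A}\|)$ cost per child in the analysis above. An atom written with the child variable first, such as $E y x$, would have to be stored with its pairs swapped.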

So model-checking for formulas whose underlying graph is a tree is tractable. The 
algorithm of the previous example can easily be extended to formulas whose underlying 
graph has bounded tree-width. Theorem 14 shows that this is essentially all we can do; 
for conjunctive queries whose underlying graph is more complicated, model-checking 
gets intractable. 

Theorem 13 (Chekuri and Rajaraman [5]). Let $\mathcal{C}$ be a class of graphs of bounded
tree-width. Then the combined complexity of CQ($\mathcal{C}$)-Model-Checking is in PTIME,
and the combined complexity of CQ($\mathcal{C}$)-Evaluation is in PTT.

It is not hard to see that the combined complexity of CQ($\mathcal{C}$)-Construction and
CQ($\mathcal{C}$)-Counting for classes $\mathcal{C}$ of graphs of bounded tree-width is also in PTIME and
that the combined complexity of CQ($\mathcal{C}$)-Evaluation is actually in PD [21]. Of course
these results imply that the parameterized complexity of the respective problems is in
FPT or FPTD.

Theorem 14 (Grohe, Schwentick, and Segoufin [22]).

(1) Let $\mathcal{C}$ be a class of graphs of unbounded tree-width that is closed under taking
minors. Then CQ($\mathcal{C}$)-Model-Checking is NP-complete.

(2) Let $\mathcal{C}$ be a class of graphs of unbounded tree-width. Then CQ($\mathcal{C}$)-Model-Checking
is W[1]-complete.

Instead of a graph, we may also associate a hypergraph with a conjunctive query in
a natural way. It turns out that the generalized model-checking problems also become
tractable for conjunctive queries with tree-like hypergraphs. In a fundamental paper that
is underlying all the work described in this section, Yannakakis [33] proved that the
combined complexity of Model-Checking for conjunctive queries with an acyclic 
hypergraph is in PTIME, and the combined complexity of the EVALUATION problem 
for such queries is in PTT. As a matter of fact, the algorithm described in Example 12 
is essentially the one suggested by Yannakakis. As for graphs, the acyclicity restric- 
tion can also be relaxed for hypergraphs. Gottlob, Leone, and Scarcello [20] introduce 
the notion of bounded hypertree-width and show that conjunctive queries whose hy- 
pergraph has bounded hypertree-width have tractable model-checking problems. As a 
matter of fact, they show that Model-Checking for all the tree-like classes of con- 
junctive queries considered here is in the parallel complexity class LOGCFL and actu- 
ally complete for this class. 

If we look beyond conjunctive queries, there are two well-known classes of first- 
order formulas whose model-checking problems have a polynomial time combined 
complexity: The finite variable fragments of first-order logic and the guarded fragment. 
Surprisingly, these two fragments are closely related to the tree-like classes of
conjunctive queries. A straightforward generalization of the class of conjunctive queries whose
underlying graph has tree-width at most $(k - 1)$ to full first-order logic yields the
$k$-variable fragment. Similarly, a generalization of the class of conjunctive queries whose
underlying hypergraph is acyclic yields the guarded fragment [12]. 

4 Simple Structures 

In this section, we restrict the class C of input structures of first-order generalized 
model-checking problems. We first note that there is not much we can do about the
combined complexity: The PSPACE-completeness of the Quantified Boolean Formula
problem implies that FO-Model-Checking is already PSPACE-complete on
the class $\{\mathcal{B}\}$ consisting of just one structure $\mathcal{B}$ with universe $\{0, 1\}$ and one unary
relation, namely $\{1\}$. So we concentrate on parameterized complexity here.

4.1 Gaifman’s Locality Theorem 

An important property that distinguishes first-order logic from most stronger logics is 
that it can only express local properties. The model-checking algorithms that we shall 
consider in this section crucially depend on locality. 

The Gaifman graph of a $\tau$-structure $\mathcal{A}$ is the graph with universe $A$ that has an edge
between distinct elements $a, b \in A$ if there is a relation symbol $R \in \tau$ and a tuple $\bar{a}$ of
elements of $A$ such that $\bar{a} \in R^{\mathcal{A}}$, and both $a$ and $b$ appear in $\bar{a}$. The distance $d^{\mathcal{A}}(a, b)$
between two elements $a, b \in A$ in $\mathcal{A}$ is the length of the shortest path from $a$ to $b$ in the
Gaifman graph of $\mathcal{A}$. For every $r > 0$ and $a \in A$, the $r$-neighborhood of $a$ in $\mathcal{A}$ is the set
$N_r^{\mathcal{A}}(a) := \{b \in A \mid d^{\mathcal{A}}(a, b) \le r\}$. For a set $B \subseteq A$ we let $N_r^{\mathcal{A}}(B) := \bigcup_{b \in B} N_r^{\mathcal{A}}(b)$.
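
As a concrete illustration of these definitions, the following Python sketch builds the Gaifman graph of a relational structure and computes $r$-neighborhoods by breadth-first search. The representation of a structure (a universe plus a dictionary mapping relation names to sets of tuples) and the function names are assumptions made for this example.

```python
from collections import deque
from itertools import combinations

def gaifman_graph(universe, relations):
    """Adjacency sets of the Gaifman graph: a ~ b iff they are distinct and share a tuple."""
    adj = {a: set() for a in universe}
    for tuples in relations.values():
        for t in tuples:
            for a, b in combinations(set(t), 2):
                adj[a].add(b)
                adj[b].add(a)
    return adj

def neighborhood(adj, sources, r):
    """N_r(sources): all elements within distance <= r of some source (multi-source BFS)."""
    dist = {s: 0 for s in sources}
    queue = deque(dist)
    while queue:
        a = queue.popleft()
        if dist[a] == r:          # do not expand beyond radius r
            continue
        for b in adj[a]:
            if b not in dist:
                dist[b] = dist[a] + 1
                queue.append(b)
    return set(dist)

# A structure with a binary relation E and a ternary relation T.
universe = [1, 2, 3, 4, 5]
relations = {"E": {(1, 2), (2, 3), (3, 4)}, "T": {(4, 5, 5)}}
adj = gaifman_graph(universe, relations)
print(neighborhood(adj, [1], 2))
print(neighborhood(adj, [1, 5], 1))
```

In this example the Gaifman graph is the path 1, 2, 3, 4, 5, since the tuple (4, 5, 5) contributes only the edge between 4 and 5.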

It is easy to see that for every vocabulary $\tau$ and every $r > 0$ there is a formula
$\delta_r(x, y) \in \mathrm{FO}[\tau]$ such that for every $\tau$-structure $\mathcal{A}$ we have
$\delta_r(\mathcal{A}) = \{(a, b) \in A^2 \mid d^{\mathcal{A}}(a, b) \le r\}$. We write $d(x, y) \le r$ instead of $\delta_r(x, y)$. For a sentence $\psi \in \mathrm{FO}[\tau]$
we let $\psi^{N_r(x)}(x)$ denote the relativization of $\psi$ to $N_r(x)$, that is, the formula obtained
from $\psi$ by replacing every subformula of the form $\exists y \chi$ by $\exists y (d(x, y) \le r \wedge \chi)$ and
every subformula of the form $\forall y \chi$ by $\forall y (d(x, y) \le r \rightarrow \chi)$. Here we assume, without
loss of generality, that $x$ does not occur in $\psi$.







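Observe that the formula $\psi^{N_r(x)}(x)$ is $r$-local in the following sense: For every
$\tau$-structure $\mathcal{A}$ and for every $a \in A$ we have

$$\mathcal{A} \models \psi^{N_r(x)}(a) \iff \langle N_r^{\mathcal{A}}(a) \rangle \models \psi^{N_r(x)}(a). \qquad (*)$$

Here $\langle N_r^{\mathcal{A}}(a) \rangle$ denotes the substructure induced by $\mathcal{A}$ on $N_r^{\mathcal{A}}(a)$.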

Theorem 15 (Gaifman [16]). Every first-order sentence $\varphi$ is equivalent to a Boolean
combination of sentences of the form

$$\exists x_1 \ldots \exists x_k \Bigl( \bigwedge_{1 \le i < j \le k} d(x_i, x_j) > 2r \;\wedge \bigwedge_{1 \le i \le k} \psi^{N_r(x)}(x_i) \Bigr), \qquad (**)$$

where $k, r > 0$ and $\psi$ is a first-order sentence.

Furthermore, such a Boolean combination can be effectively computed from $\varphi$.

4.2 Structures of Low Degree 

In this subsection we generalize a theorem due to Seese [27] stating that first-order de- 
finable properties of structures of bounded degree can be checked in linear time. Our 
main purpose is to illustrate how Gaifman’s theorem can be used in model-checking 
algorithms. Seese’s proof is different; it uses Hanf’s locality theorem instead of
Gaifman’s theorem. This technique cannot be used in order to prove Theorem 16.

The degree of a structure $\mathcal{A}$, denoted by $\deg(\mathcal{A})$, is the maximal degree of a vertex in
the Gaifman graph of $\mathcal{A}$. We say that a class $\mathcal{C}$ of structures has low degree if for every
$\varepsilon > 0$ there is an integer $N_\varepsilon$ such that for all $\mathcal{A} \in \mathcal{C}$ with $|A| > N_\varepsilon$ we have
$\deg(\mathcal{A}) \le |A|^\varepsilon$. For example, the class of all structures whose degree is at most logarithmic in
their size has low degree.

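Theorem 16. There is an algorithm $\mathfrak{A}$ for FO-Model-Checking and a function $f$
such that for every class $\mathcal{C}$ of structures that has low degree and for every $\varepsilon > 0$ the
runtime of $\mathfrak{A}$ on an input $(\mathcal{A}, \varphi) \in \mathcal{C} \times \mathrm{FO}$ is in $O(f(\|\varphi\|) \cdot |A|^{1+\varepsilon})$.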

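Proof: By Gaifman’s Theorem, it suffices to find an algorithm that model-checks
formulas $\varphi$ of the form (**). So let

$$\varphi = \exists x_1 \ldots \exists x_k \Bigl( \bigwedge_{1 \le i < j \le k} d(x_i, x_j) > 2r \;\wedge \bigwedge_{1 \le i \le k} \psi^{N_r(x)}(x_i) \Bigr),$$

and let $\mathcal{A}$ be a structure.

Our model-checking algorithm first computes the set $\psi^{N_r(x)}(\mathcal{A})$, proceeding as
follows: For every $a \in A$, it computes $N_r^{\mathcal{A}}(a)$ and then it checks whether
$\langle N_r^{\mathcal{A}}(a) \rangle \models \psi^{N_r(x)}(a)$. By (*), this is equivalent to $a \in \psi^{N_r(x)}(\mathcal{A})$.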

$\langle N_r^{\mathcal{A}}(a) \rangle \models \psi^{N_r(x)}(a)$ is checked in a straightforward way; this requires time [...].
Hence the overall time needed to compute $\psi^{N_r(x)}(\mathcal{A})$ is at most [...]
for some constant $c > 0$.

Let $S := \psi^{N_r(x)}(\mathcal{A})$. To decide whether $\mathcal{A} \models \varphi$, it remains to check whether
there are $a_1, \ldots, a_k \in S$ such that $d^{\mathcal{A}}(a_i, a_j) > 2r$ for $1 \le i < j \le k$. A simple
algorithm doing this is described in [15]: It starts by picking an arbitrary $a_1 \in S$ (if $S$
is empty, the algorithm immediately rejects). Having picked $a_1, \ldots, a_i \in S$ of pairwise
distance greater than $2r$, the algorithm tries to find an $a_{i+1} \in S$ of distance greater
than $2r$ from $a_1, \ldots, a_i$. Either it will eventually find $a_1, \ldots, a_k \in S$ and accept, or
after having found $a_1, \ldots, a_l$ for some $l < k$ it will get stuck. This means that
$S \subseteq N_{2r}^{\mathcal{A}}(\{a_1, \ldots, a_l\})$. Noting that for all sets $B \subseteq A$ and $a, b \in B$ we have
$d^{\mathcal{A}}(a, b) > 2r \iff d^{\langle N_r^{\mathcal{A}}(B) \rangle}(a, b) > 2r$, it now suffices to find out if there are $a'_1, \ldots, a'_k \in S$
such that $d^{\langle N_{3r}^{\mathcal{A}}(\{a_1, \ldots, a_l\}) \rangle}(a'_i, a'_j) > 2r$ for $1 \le i < j \le k$. Our algorithm does this by first
computing a distance matrix for $\langle N_{3r}^{\mathcal{A}}(\{a_1, \ldots, a_l\}) \rangle$ and then exhaustively searching all
$k$-tuples. This requires time $O\bigl(k^2 \, \|\langle N_{3r}^{\mathcal{A}}(\{a_1, \ldots, a_l\}) \rangle\|^{k}\bigr)$. Finding $a_1, \ldots, a_l$
requires time $O(k \cdot \|\mathcal{A}\|)$.

A few straightforward computations show that on classes of input structures of low 
degree, our algorithm satisfies the requirements posed on its runtime. □ 
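
The second half of the proof, the search for $k$ elements of $S$ with pairwise distance greater than $2r$, can be sketched as follows. This Python code is a simplified illustration with hypothetical names: it takes the Gaifman graph as adjacency sets, and in the exhaustive phase it computes distances by breadth-first search in the whole graph rather than building a distance matrix for the induced neighborhood structure as the proof does.

```python
from collections import deque
from itertools import combinations

def ball(adj, source, radius):
    """All vertices within distance <= radius of `source` (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        a = queue.popleft()
        if dist[a] == radius:
            continue
        for b in adj[a]:
            if b not in dist:
                dist[b] = dist[a] + 1
                queue.append(b)
    return set(dist)

def scattered_tuple(adj, S, k, r):
    """Return k elements of S with pairwise distance > 2r, or None if there are none."""
    picked, covered = [], set()
    for a in S:                                   # greedy phase
        if a not in covered:
            picked.append(a)
            covered |= ball(adj, a, 2 * r)
            if len(picked) == k:
                return picked
    # Stuck after l < k picks: every element of S lies within 2r of a picked one,
    # so S is small on low-degree structures and can be searched exhaustively.
    candidates = list(S)
    near = {a: ball(adj, a, 2 * r) for a in candidates}
    for combo in combinations(candidates, k):
        if all(b not in near[a] for a, b in combinations(combo, 2)):
            return list(combo)
    return None

# Path graph 0 - 1 - 2 - 3 - 4.
n = 5
adj = {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}
print(scattered_tuple(adj, [2, 0, 4], 2, 1))
```

In the example the greedy phase picks vertex 2 first and then gets stuck, although 0 and 4 have distance 4 > 2; the exhaustive phase finds them. This is why getting stuck does not by itself justify rejecting.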

We note that on classes $\mathcal{C}$ of bounded degree, the algorithm $\mathfrak{A}$ of Theorem 16
actually runs in time linear in $|A|$, which implies Seese’s result mentioned above.

Corollary 17. For every class $\mathcal{C}$ of structures of low degree, the parameterized
complexity of FO-Model-Checking on $\mathcal{C}$ is in FPT.

For every $n \ge 1$ and $p \in [0, 1]$, we let $G(n, p)$ denote the probability space of all
graphs with vertex set $\{1, \ldots, n\}$ and edge-probability $p$. We call a function
$p : \mathbb{N} \to [0, 1]$ sparse if $p(n) \in O(n^{\varepsilon - 1})$ for all $\varepsilon > 0$. For example, the function $p$ defined by
$p(n) = \log(n)/n$ is sparse.

Corollary 18. For every first-order sentence $\varphi$ there is an algorithm $\mathfrak{A}$ that, given a
graph $\mathcal{G}$, decides if $\mathcal{G} \models \varphi$ such that for every sparse $p : \mathbb{N} \to [0, 1]$ we have:
For every $\varepsilon > 0$, the average runtime of $\mathfrak{A}$ on input $\mathcal{G} \in G(n, p(n))$ is in $O(n^{1+\varepsilon})$.

Proof: We use the algorithm of Theorem 16. Some simple computations show that the
probability that a graph $\mathcal{G} \in G(n, p(n))$ has degree greater than $n^\varepsilon$ is exponentially low
(for every $\varepsilon > 0$). Even on the few high-degree graphs in $G(n, p(n))$, the runtime is in
$n^{O(\|\varphi\|)}$, so we obtain a low average runtime. □



4.3 Tree-Width, Local Tree-Width, and Excluded Minors

Model-checking does not only become tractable if the input formulas have a tree-like 
structure, but also if the input structures are tree-like. This is even true for monadic 
second-order logic MSO, which is much more powerful than FO. The underlying rea- 
son for this is that MSO-sentences on trees can be translated to tree-automata, and it 
is easy to check whether a given tree-automaton accepts a given tree. A well-known 
result due to Courcelle [7] states that MSO-Model-Checking on classes of structures
of bounded tree-width is possible in time $O(f(\|\varphi\|) \cdot |A|)$ (for a suitable function
$f$) and therefore fixed-parameter tractable. Unfortunately, the function $f$ is extremely
fast-growing. It is non-elementary; essentially, it is a tower of 2s whose height is the
number of quantifier-alternations of $\varphi$. Arnborg, Lagergren, and Seese [2] show that
MSO-Counting on classes of structures of bounded tree-width is possible in time
$O(f(\|\varphi\|) \cdot |A|)$. Flum, Frick, and Grohe [12] show that MSO-Construction on
such classes is in time $O(f(\|\varphi\|) \cdot |A|)$ and that MSO-Evaluation is in (total) time
$O(f(\|\varphi\|) \cdot (|A| + \|\varphi(\mathcal{A})\|))$. Since FO $\subseteq$ MSO, the corresponding results for generalized
model-checking problems for FO on classes of structures of bounded tree-width
follow.

Remembering that first-order logic is local, we observe that actually we do not 
need the whole input structures to have bounded tree-width in order to make 
model-checking tractable. It suffices to have structures that locally have bounded tree-width. 
Then we can apply Gaifman's Theorem to the input sentence, evaluate the local 
formulas (see (**)) using Courcelle's approach, and put everything together 
as in the proof of Theorem 16. Let us make this precise: A class C of structures has 
bounded local tree-width if there is a function $\lambda : \mathbb{N} \to \mathbb{N}$ such that for all structures 
$\mathcal{A} \in C$, all $a \in A$, and all $r \in \mathbb{N}$, the substructure $\langle N_r^{\mathcal{A}}(a) \rangle$ of $\mathcal{A}$ has tree-width at most 
$\lambda(r)$. Many interesting classes have bounded local tree-width, among them the class of 
planar graphs and more generally all classes of graphs of bounded genus, all classes of 
structures of bounded degree (but not all classes of low degree), and, trivially, all classes 
of structures of bounded tree-width. 
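
The notion of an r-neighborhood used in this definition can be made concrete with a small sketch. It assumes the structure is an undirected graph given as an adjacency dictionary; the names `r_neighborhood` and `induced_subgraph` are illustrative and do not come from the paper. The sketch only extracts the substructure induced by the vertices at distance at most r from a; it does not compute tree-width.

```python
from collections import deque

def r_neighborhood(adj, a, r):
    """Return the set of vertices at distance at most r from vertex a.

    adj maps every vertex to a list of its neighbours (undirected graph).
    """
    dist = {a: 0}
    queue = deque([a])
    while queue:
        v = queue.popleft()
        if dist[v] == r:
            continue  # vertices at distance r are kept but not expanded
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return set(dist)

def induced_subgraph(adj, vertices):
    """Restrict the adjacency dictionary to the given vertex set."""
    return {v: [w for w in adj[v] if w in vertices] for v in vertices}

# Hypothetical example: the path 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
ball = r_neighborhood(path, 2, 1)
local = induced_subgraph(path, ball)
```

A class has bounded local tree-width when every such `local` substructure has tree-width bounded by a function of r alone, independently of the size of the whole structure.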

Theorem 19 (Frick and Grohe [15]). Let C be a class of structures of bounded local 
tree-width. Then the parameterized complexity of FO-Model-Checking on C is in 
FPT. More precisely, there is a function $f$ and, for every $\varepsilon > 0$, an algorithm that solves 
the problem in time $O(f(\|\varphi\|) \cdot |A|^{1+\varepsilon})$. 

Requiring the class C to be locally tree-decomposable, which is slightly more 
restrictive than just requiring it to have bounded local tree-width, we can actually find a 
model-checking algorithm that is linear in $|A|$ [15]. All examples of classes of bounded 
local tree-width that we have seen above are actually examples of locally 
tree-decomposable classes. In his forthcoming dissertation, Frick [14] is able to extend this result 
to the other generalized model-checking problems: He gives algorithms solving 
FO-Construction and FO-Counting in time linear in $|A|$ and FO-Evaluation in 
total time linear in $(|A| + \|\varphi(\mathcal{A})\|)$. 

Building on the ideas of using locality and tree-decompositions to evaluate first- 
order formulas more efficiently, Flum and Grohe [13] showed that for every class 
C of graphs with an excluded minor, the parameterized complexity of 
FO-Model-Checking on C is in FPT. Here we say that a class C of graphs has an excluded minor 
if there is a graph H such that H is not a minor of any graph in C. 



5 Conclusions 

We have seen different approaches towards finding tractable instances for the gener- 
alized model-checking problems for first-order logic, which are known to be hard in 
general. 

All known fragments of FO that have a tractable model-checking problem can 
be characterized as being tree-like in some sense. Efficient model-checking for such 
classes can be done by a relatively simple dynamic programming algorithm that goes 
back to Yannakakis [33]. Theorem 14 indicates that in some sense, these results are 
optimal. 

Model-checking problems for full first-order logic become fixed-parameter tractable 
on several interesting classes of input structures, among them classes of low degree and 
the class of planar graphs. The fixed-parameter tractable algorithms make crucial use 
of the locality of first-order logic. 

The results discussed in Section 4, in particular Theorem 19, imply a number of 
known results on the fixed-parameter tractability of more concrete problems on certain 
classes of graphs (for instance, the result that Dominating Set is fixed-parameter 
tractable on planar graphs [9]). The original proofs of these results are often very ad- 
hoc and vary a lot from problem to problem and for the different classes of structures. 
The results on FO-Model-Checking give a nice uniform explanation for all of these 
results. Moreover, they give us a simple way to see that a particular problem is fixed- 
parameter tractable, say, on the class of planar graphs, or, to give a fancier example, 
on the class of all graphs that have a knot-free embedding into $\mathbb{R}^3$.¹ Just show that the 
problem is first-order definable. 

The price we pay for this generality is that we obtain algorithms with huge hidden 
constants that are only of theoretical interest. But often it is a good starting point to 
have at least some fixed-parameter tractable algorithm for a particular problem (or just 
to know that such an algorithm exists) when designing one that is more practical. 



References 

1. A.V. Aho, J.E. Hopcroft, and J.D. Ullman. The Design and Analysis of Computer Algorithms. 
Addison-Wesley, 1974. 

2. S. Arnborg, J. Lagergren, and D. Seese. Easy problems for tree-decomposable graphs. 
Journal of Algorithms, 12:308-340, 1991. 

3. A. Chandra and D. Harel. Structure and complexity of relational queries. Journal of Com- 
puter and System Sciences, 25:99-128, 1982. 

4. A.K. Chandra and P.M. Merlin. Optimal implementation of conjunctive queries in relational 
data bases. In Proceedings of the 9th ACM Symposium on Theory of Computing, pages 
77-90, 1977. 

5. Ch. Chekuri and A. Rajaraman. Conjunctive query containment revisited. In Ph. Kolaitis 
and F. Afrati, editors, Proceedings of the 5th International Conference on Database Theory, 
volume 1186 of Lecture Notes in Computer Science, pages 56-70. Springer-Verlag, 1997. 

6. J.H. Conway and C.McA. Gordon. Knots and links in spatial graphs. Journal of Graph 
Theory, 7:445-453, 1983. 

7. B. Courcelle. Graph rewriting: An algebraic and logic approach. In J. van Leeuwen, editor, 
Handbook of Theoretical Computer Science, volume 2, pages 194-242. Elsevier Science 
Publishers, 1990. 

8. R. Diestel. Graph Theory. Springer-Verlag, second edition, 2000. 

9. R.G. Downey and M.R. Fellows. Parameterized Complexity. Springer-Verlag, 1999. 

10. R.G. Downey, M.R. Fellows, and U. Taylor. The parameterized complexity of relational 
database queries and an improved characterization of W[1]. In Bridges, Calude, Gibbons, 
Reeves, and Witten, editors, Combinatorics, Complexity, and Logic - Proceedings of DMTCS 
'96, pages 194-213. Springer-Verlag, 1996. 

¹ $K_7$ is an excluded minor for this class [6]. 







11. T. Feder and M.Y. Vardi. Monotone monadic SNP and constraint satisfaction. In Proceedings 
of the 25th ACM Symposium on Theory of Computing, pages 612-622, 1993. 

12. J. Flum, M. Frick, and M. Grohe. Query evaluation via tree-decompositions. In Jan van den 
Bussche and Victor Vianu, editors, Proceedings of the 8th International Conference on 
Database Theory, Lecture Notes in Computer Science. Springer-Verlag, 2001. To appear. 

13. J. Flum and M. Grohe. Fixed-parameter tractability and logic. Submitted for publication. 

14. M. Frick. Easy Instances for Model Checking. PhD thesis, Albert-Ludwigs-Universität 
Freiburg. To appear. 

15. M. Frick and M. Grohe. Deciding first-order properties of locally tree-decomposable 
structures. Submitted for publication. A preliminary version of the paper appeared in Proceedings 
of the 26th International Colloquium on Automata, Languages and Programming, LNCS 
1644, Springer-Verlag, 1999. 

16. H. Gaifman. On local and non-local properties. In Proceedings of the Herbrand Symposium, 
Logic Colloquium ’81. North Holland, 1982. 

17. M.R. Garey and D.S. Johnson. Computers and Intractability: A Guide to the Theory of 
NP-Completeness. Freeman, 1979. 

18. L.A. Goldberg. Efficient Algorithms for Listing Combinatorial Structures. Cambridge Uni- 
versity Press, 1993. 

19. G. Gottlob, N. Leone, and F. Scarcello. A comparison of structural CSP decomposition 
methods. In Thomas Dean, editor, Proceedings of the Sixteenth International Joint Conference 
on Artificial Intelligence, pages 394-399. Morgan Kaufmann, 1999. 

20. G. Gottlob, N. Leone, and F. Scarcello. Hypertree decompositions and tractable queries. In 
Proceedings of the 18th ACM Symposium on Principles of Database Systems, pages 21-32, 
1999. 

21. M. Grohe. The complexity of generalized model-checking problems. In preparation. 

22. M. Grohe, T. Schwentick, and L. Segoufin. When is the evaluation of conjunctive queries 
tractable, 2000. Submitted for publication. 

23. M.R. Jerrum, L.G. Valiant, and V.V. Vazirani. Random generation of combinatorial structures 
from a uniform distribution. Theoretical Computer Science, 42:169-188, 1986. 

24. D.S. Johnson, C.H. Papadimitriou, and M. Yannakakis. On generating all maximal 
independent sets. Information Processing Letters, 27:119-123, 1988. 

25. Ph.G. Kolaitis and M.Y. Vardi. Conjunctive-query containment and constraint satisfaction. 
In Proceedings of the 17th ACM Symposium on Principles of Database Systems, pages 205-213, 
1998. 

26. C.H. Papadimitriou and M. Yannakakis. On the complexity of database queries. In Proceed- 
ings of the 17th ACM Symposium on Principles of Database Systems, pages 12-19, 1997. 

27. D. Seese. Linear time computable problems and first-order descriptions. Mathematical 
Structures in Computer Science, 6:505-526, 1996. 

28. L.J. Stockmeyer. The Complexity of Decision Problems in Automata Theory. PhD thesis, 
Department of Electrical Engineering, MIT, 1974. 

29. L.G. Valiant. The complexity of combinatorial computations: An introduction. In GI 
8. Jahrestagung Informatik, Fachberichte 18, pages 326-337, 1978. 

30. P. van Emde Boas. Machine models and simulations. In J. van Leeuwen, editor, Handbook 
of Theoretical Computer Science, volume 1, pages 1-66. Elsevier Science Publishers, 1990. 

31. M.Y. Vardi. The complexity of relational query languages. In Proceedings of the 14th ACM 
Symposium on Theory of Computing, pages 137-146, 1982. 

32. M.Y. Vardi. Constraint satisfaction and database theory: A tutorial. Proceedings of the 
19th ACM Symposium on Principles of Database Systems, pages 76-85, 2000. 

33. M. Yannakakis. Algorithms for acyclic database schemes. In 7th International Conference 
on Very Large Data Bases, pages 82-94, 1981. 




Myhill-Nerode Relations on Automatic Systems and the 
Completeness of Kleene Algebra 



Dexter Kozen 

Department of Computer Science 
Cornell University, Ithaca, NY 14853-7501, USA 
kozen@cs.cornell.edu 



Abstract. It is well known that finite square matrices over a Kleene algebra again 
form a Kleene algebra. This is also true for infinite matrices under suitable restric- 
tions. One can use this fact to solve certain infinite systems of inequalities over 
a Kleene algebra. Automatic systems are a special class of infinite systems that 
can be viewed as infinite-state automata. Automatic systems can be collapsed us- 
ing Myhill-Nerode relations in much the same way that finite automata can. The 
Brzozowski derivative on an algebra of polynomials over a Kleene algebra gives 
rise to a triangular automatic system that can be solved using these methods. This 
provides an alternative method for proving the completeness of Kleene algebra. 



1 Introduction 

Kleene algebra (KA) is the algebra of regular expressions. It dates to a 1956 paper of 
Kleene [7] and was further developed in the 1971 monograph of Conway [4]. Kleene 
algebra has appeared in one form or another in relational algebra [16,20], semantics and 
logics of programs [8,17], automata and formal language theory [14,15], and the design 
and analysis of algorithms [1,6,9]. Many authors have contributed over the years to the 
development of the algebraic theory; see [11] and references therein. There are many 
competing definitions and axiomatizations, and in fact there is no universal agreement 
on the definition of Kleene algebra. 

In [10], a Kleene algebra was defined to be an idempotent semiring such that $a^*b$ 
is the least solution to $b + ax \le x$ and $ba^*$ the least solution to $b + xa \le x$. This is a 
finitary universal Horn axiomatization (universally quantified equations and equational 
implications). These axioms were shown in [10] to be sound and complete for the equa- 
tional theory of the regular sets, improving a 1966 result of Salomaa [19]. Salomaa’s 
axiomatization is sound and complete for the regular sets, but his axiom for * involves a 
nonalgebraic side condition that renders it unsound over other interpretations of impor- 
tance, such as relational models. In contrast, the axiomatization of [10] is sound over a 
wide variety of models that arise in computer science, including relational models. No 
finitary axiomatization consisting solely of equations exists [18]. 

Matrices over a Kleene algebra, under the proper definition of the matrix operators, 
again form a Kleene algebra. This fundamental construction has many applications: the 
solution of systems of linear inequalities, construction of regular expressions equivalent 
to a given finite automaton, an algebraic treatment of finite automata in terms of their 
transition matrices, shortest path algorithms in directed graphs. In [10] it is used to 
encode algebraically various combinatorial constructions in the theory of finite automata, 
including determinization via the subset construction and state minimization via the 
formation of a quotient modulo a Myhill-Nerode relation (see [5,12]). A key theorem 
of Kleene algebra used in both these constructions is 

$$ax = xb \;\Rightarrow\; a^*x = xb^*. \qquad (1)$$ 

Intuitively, x represents a transformation between two state spaces, and a and b are tran- 
sition relations of automata on those respective state spaces. The theorem represents a 
kind of bisimulation relationship. The completeness proof depends on the uniqueness of 
minimal deterministic automata: given two regular expressions representing the same 
regular set, it is shown how the construction of the unique minimal deterministic au- 
tomaton can be carried out purely algebraically and the equivalence deduced from the 
axioms of Kleene algebra. 

In this paper we give a new proof of completeness that does not depend on the 
uniqueness of minimal automata. Our approach is via a generalization of Myhill-Nerode 
relations. We introduce automatic systems, a special class of infinite systems that can be 
viewed as infinite-state automata. Automatic systems can be collapsed using Myhill- 
Nerode relations in much the same way that finite automata can. Again, the chief prop- 
erty describing the relationship between the collapsed and uncollapsed systems is (1). 
The Brzozowski derivative [3] on an algebra of polynomials over a Kleene algebra gives 
rise to a triangular automatic system that can be solved using these methods. Com- 
pleteness is proved essentially by showing that two equivalent systems have a common 
Myhill-Nerode unwinding. 

2 Kleene Algebra 

Kleene algebra was introduced by S. C. Kleene (see [4]). We define a Kleene algebra 
to be an idempotent semiring such that $a^*b$ is the least solution to $b + ax \le x$ and $ba^*$ 
the least solution to $b + xa \le x$. This axiomatization is from [10], to which we refer 
the reader for further definitions and basic results. 

The free Kleene algebra $\mathcal{F}_\Sigma$ on a finite set of generators $\Sigma$ is normally constructed 
as the set of regular expressions over $\Sigma$ modulo the Kleene algebra axioms. This is the 
same as $2[\Sigma]$, the algebra of Kleene polynomials over indeterminates $\Sigma$, where 2 is the 
two-element Kleene algebra. As shown in [10], $\mathcal{F}_\Sigma$ is isomorphic to $\mathrm{Reg}_\Sigma$, the Kleene 
algebra of regular sets of strings over $\Sigma$. 

The evaluation morphism $\varepsilon : 2[\Sigma] \to 2$, where $\varepsilon(a) = 0$ for $a \in \Sigma$, corresponds 
to the empty word property (EWP) discussed by Salomaa [2,19]. This map satisfies the 
property that $\varepsilon(\beta) = 1$ if $1 \le \beta$, 0 otherwise. 

3 Generalized Triangular Matrices 

Let $A$ be a set and $\le$ a preorder (reflexive and transitive) on $A$. The preordered set $A$ is 
finitary if all principal upward-closed sets $A_\alpha \overset{\mathrm{def}}{=} \{\beta \in A \mid \alpha \le \beta\}$ are finite. 

A (generalized) triangular matrix on a finitary preordered set $A$ over a Kleene 
algebra $K$ is a map $e : A^2 \to K$ such that $e_{\alpha,\beta} = 0$ whenever $\alpha \not\le \beta$. The family of 
generalized triangular matrices on $A$ over $K$ is denoted $\mathrm{Mat}(A, K)$. 
There are several ways this definition generalizes the usual notion of triangular 
matrix. Ordinarily, the index set is finite and totally ordered, usually $\{1, \ldots, n\}$ with its 
natural order, and triangular is defined with respect to this order. In the present 
development, the index set $A$ can be infinite and the order can be any finitary preorder. There 
can be pairwise incomparable elements, as well as "loops" with distinct elements $\alpha, \beta$ 
such that $\alpha \le \beta$ and $\beta \le \alpha$. 

Nevertheless, the restrictions we have imposed are sufficient to allow the definition 
of the usual matrix operations on $\mathrm{Mat}(A, K)$. For $e, f \in \mathrm{Mat}(A, K)$, let 

$$(e + f)_{\alpha,\beta} \overset{\mathrm{def}}{=} e_{\alpha,\beta} + f_{\alpha,\beta}, \qquad (ef)_{\alpha,\beta} \overset{\mathrm{def}}{=} \sum_{\gamma} e_{\alpha,\gamma} f_{\gamma,\beta}.$$ 

Because $A$ is finitary, the sum in the definition of matrix product is finite. It is not 
difficult to verify that the structure $\mathrm{Mat}(A, K)$ forms an idempotent semiring under 
these definitions. 

Now we wish to define the operator $^*$ on $\mathrm{Mat}(A, K)$ so as to make it a Kleene 
algebra. That $A$ is finitary is essential here. We define $(e^*)_{\alpha,\beta}$ to be 
$((e \upharpoonright A_\alpha)^*)_{\alpha,\beta}$, where $e \upharpoonright A_\alpha$ is the restriction of $e$ to domain $A_\alpha^2$. 
Since $A_\alpha$ is finite, $e \upharpoonright A_\alpha$ is a finite square 
submatrix of $e$, so $(e \upharpoonright A_\alpha)^*$ exists. Actually, we could have restricted $e$ to any finite 
upward-closed subset $B \subseteq A$ containing $\alpha$ and gotten the same result. 

Formally, let $1_B$ denote the restriction of 1 to domain $A \times B$, where $B \subseteq A$ is 
upward-closed. The restriction of $e$ to domain $B^2$ can be represented matricially by 
$1_B^T e 1_B$. If $B$ is finite, then $1_B^T e 1_B$ is a finite square matrix, therefore the $^*$ operator 
can be applied to obtain the matrix $(1_B^T e 1_B)^*$. We define 

$$e^* \overset{\mathrm{def}}{=} \sup_B \; 1_B (1_B^T e 1_B)^* 1_B^T \qquad (2)$$ 

where the supremum is taken over all finite upward-closed subsets $B \subseteq A$. It can be 
shown by elementary arguments that the value of the right-hand side of (2) at $\alpha, \beta$ is a 
constant independent of $B$ if $\alpha \in B$ and 0 if $\alpha \notin B$. Since there is at least one finite 
upward-closed subset of $A$ containing $\alpha$ (namely $A_\alpha$), the supremum exists. 



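The identity and zero elements of $\mathrm{Mat}(A, K)$ are the matrices 

$$1_{\alpha,\beta} \overset{\mathrm{def}}{=} \begin{cases} 1, & \text{if } \alpha = \beta \\ 0, & \text{otherwise} \end{cases} \qquad 0_{\alpha,\beta} \overset{\mathrm{def}}{=} 0.$$ 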



4 Infinite Systems of Linear Inequalities 

We can exploit the Kleene algebra structure of Mat(A, K) to solve triangular systems 
of linear inequalities indexed by the infinite set A. Such a system is represented by a 
triangular matrix $e \in \mathrm{Mat}(A, K)$ and vector $c : A \to K$ as 

$$\sum_{\beta} e_{\alpha,\beta} X_\beta + c_\alpha \le X_\alpha, \qquad \alpha \in A,$$ 

where $X$ is a vector of indeterminates. This is equivalent to the infinite matrix-vector 
inequality $eX + c \le X$. 
A solution of the system $(A, e, c)$ over $K$ is a map $\sigma : A \to K$ such that 

$$\sum_{\beta} e_{\alpha,\beta} \sigma_\beta + c_\alpha \le \sigma_\alpha, \qquad \alpha \in A,$$ 

or in other words $e\sigma + c \le \sigma$. As in the finite case, the unique least solution to this 
system is $e^*c$. 
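
For a finite index set, the least solution $e^*c$ can be computed directly. The following is a minimal sketch over the two-element Kleene algebra 2 (Booleans with "or" as sum, "and" as product, and star constantly true), where the matrix star is the reflexive-transitive closure. The function names and the example system are illustrative assumptions, not part of the paper.

```python
def mat_star(e):
    """Star of a square Boolean matrix: reflexive-transitive closure (Warshall)."""
    n = len(e)
    s = [[e[i][j] or i == j for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                s[i][j] = s[i][j] or (s[i][k] and s[k][j])
    return s

def mat_vec(m, v):
    """Matrix-vector product over the Boolean Kleene algebra."""
    return [any(m[i][j] and v[j] for j in range(len(v))) for i in range(len(m))]

def least_solution(e, c):
    """Least x with e x + c <= x, computed as e* c."""
    return mat_vec(mat_star(e), c)

def is_solution(e, c, x):
    """Check e x + c <= x componentwise (False <= True in the Boolean order)."""
    ex = mat_vec(e, x)
    return all(x[i] or not (ex[i] or c[i]) for i in range(len(x)))

# Hypothetical triangular system: X0 >= X1, X1 >= X2 + c1, X2 >= c2.
e = [[False, True, False],
     [False, False, True],
     [False, False, False]]
c = [False, True, False]
x = least_solution(e, c)
```

In this instance `x` marks exactly the indices from which some index with a true constant term is reachable, which is what the least solution of a Boolean system should be.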

5 Automatic Systems 

We now focus on index sets $A$ of a special form. Let $\Sigma$ be a finite set of functions acting 
on $A$. The value of the function $a \in \Sigma$ on $\alpha \in A$ is denoted $\alpha a$. Each finite-length string 
$x \in \Sigma^*$ induces a function $x : A \to A$ defined inductively by 

$$\alpha\epsilon \overset{\mathrm{def}}{=} \alpha, \qquad \alpha(xa) \overset{\mathrm{def}}{=} (\alpha x)a,$$ 

where $\epsilon$ denotes the empty string. 
Define $\alpha \le \beta$ if $\beta = \alpha x$ for some $x \in \Sigma^*$. This is a preorder on $A$, and it is finitary iff 
for all $\alpha \in A$, the set $A_\alpha = \{\alpha x \mid x \in \Sigma^*\}$ is finite. Since $\Sigma$ is assumed to be finite, 
it follows from König's lemma that $A$ is finitary iff every $\le$-chain $\alpha_0 \le \alpha_1 \le \cdots$ has 
only finitely many distinct elements; equivalently, for every $\alpha$, every sufficiently long 
string $x \in \Sigma^*$ has two distinct prefixes $y$ and $z$ such that $\alpha y = \alpha z$. 

Now let $e \in \mathrm{Mat}(A, K)$ be a triangular matrix and $c : A \to K$ a vector over $A$ 
representing a triangular system of linear inequalities as described in the last section. 
Assume further that if $\beta \ne \alpha a$ for any $a \in \Sigma$, then $e_{\alpha,\beta} = 0$. The system of inequalities 
represented by $e$ and $c$ is thus 

$$\sum_{\alpha a} e_{\alpha,\alpha a} X_{\alpha a} + c_\alpha \le X_\alpha, \qquad \alpha \in A.$$ 

A linear system of this form is called automatic. This name is meant to suggest a 
generalization of finite-state automata over $\mathrm{Reg}_\Sigma$ to infinite-state systems over arbitrary 
Kleene algebras. One can regard $A$ as a set of states and elements of $\Sigma$ as input 
symbols. An ordinary finite-state automaton is essentially a finite automatic system over the 
Kleene algebra $\mathrm{Reg}_\Sigma$. 

6 Myhill-Nerode Relations 

Myhill-Nerode relations are fundamental in the theory of finite-state automata. Among 
other applications, they allow an automaton to be collapsed to a unique equivalent min- 
imal automaton. Myhill-Nerode relations can also be defined on finitary automatic sys- 
tems. 

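Given a finitary automatic system $S = (A, e, c)$, an equivalence relation $\equiv$ on $A$ 
is called Myhill-Nerode if the following conditions are satisfied: for all $\alpha, \beta \in A$ and 
$a \in \Sigma$, 

(i) if $\alpha \equiv \beta$, then $\alpha a \equiv \beta a$; 

(ii) if $\alpha \equiv \beta$, then $\sum_{\alpha b \equiv \alpha a} e_{\alpha,\alpha b} = \sum_{\beta b \equiv \beta a} e_{\beta,\beta b}$; 

(iii) if $\alpha \equiv \beta$, then $c_\alpha = c_\beta$. 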
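For any Myhill-Nerode relation $\equiv$ on $S = (A, e, c)$, we can construct a quotient system 
$S/{\equiv}$ as follows: 

$$[\alpha] \overset{\mathrm{def}}{=} \{\beta \in A \mid \beta \equiv \alpha\}, \qquad (e/{\equiv})_{[\alpha],[\alpha]a} \overset{\mathrm{def}}{=} \sum_{\alpha b \equiv \alpha a} e_{\alpha,\alpha b},$$ 
$$[\alpha]a \overset{\mathrm{def}}{=} [\alpha a], \qquad (c/{\equiv})_{[\alpha]} \overset{\mathrm{def}}{=} c_\alpha,$$ 
$$A/{\equiv} \overset{\mathrm{def}}{=} \{[\alpha] \mid \alpha \in A\}, \qquad S/{\equiv} \overset{\mathrm{def}}{=} (A/{\equiv},\; e/{\equiv},\; c/{\equiv}).$$ 

The matrix $e/{\equiv}$ and vector $c/{\equiv}$ are well defined by the restrictions in the definition of 
Myhill-Nerode relation. The original system $S$ can be thought of as an "unfolding" of 
the collapsed system $S/{\equiv}$. 

The set $\Sigma$ acts on $A/{\equiv}$ by $[\alpha]a \overset{\mathrm{def}}{=} [\alpha a]$. This is well defined by clause (i) in 
the definition of Myhill-Nerode relation. The preorder $\le$ on $A/{\equiv}$ is defined as in 
Section 5. This relation is easily checked to be reflexive, transitive, and finitary on 
$A/{\equiv}$. Moreover, the matrix $e/{\equiv}$ is triangular. Thus $S/{\equiv}$ is an automatic system. 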

We now describe the relationship between the solutions of the systems $S$ and $S/{\equiv}$. 
First, any solution of the collapsed system $S/{\equiv}$ can be lifted to a solution of the original 
system $S$. If $\sigma : A/{\equiv} \to K$ is a solution of $S/{\equiv}$, define $\hat\sigma : A \to K$ by 
$\hat\sigma_\alpha \overset{\mathrm{def}}{=} \sigma_{[\alpha]}$. It is easily verified that $\hat\sigma$ is a solution of $S$: 

$$\sum_{\alpha a} e_{\alpha,\alpha a} \hat\sigma_{\alpha a} + c_\alpha 
  = \sum_{\alpha a} e_{\alpha,\alpha a} \sigma_{[\alpha a]} + c_\alpha 
  = \sum_{[\alpha] a} (e/{\equiv})_{[\alpha],[\alpha]a}\, \sigma_{[\alpha]a} + (c/{\equiv})_{[\alpha]} 
  \le \sigma_{[\alpha]} = \hat\sigma_\alpha.$$ 

It is more difficult to argue that $\hat\sigma$ is the least solution to $S$. The unfolded system $S$ is 
less constrained than $S/{\equiv}$, and it is conceivable that a smaller solution could be found 
in which different but $\equiv$-equivalent $\alpha, \beta$ are assigned different values, whereas in the 
collapsed system $S/{\equiv}$, $\alpha$ and $\beta$ are unified and must have the same value. We show 
that this cannot happen. 



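Example 1. Consider the $2 \times 2$ system 

$$aY + c \le X, \qquad aX + c \le Y.$$ 

This is represented by the matrix-vector inequality 

$$\begin{bmatrix} 0 & a \\ a & 0 \end{bmatrix} \begin{bmatrix} X \\ Y \end{bmatrix} + \begin{bmatrix} c \\ c \end{bmatrix} \le \begin{bmatrix} X \\ Y \end{bmatrix}.$$ 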
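We can collapse this system by a Myhill-Nerode relation to the single inequality 
$aX + c \le X$. The least solution of the $2 \times 2$ system is given by 

$$\begin{bmatrix} X \\ Y \end{bmatrix} 
  = \begin{bmatrix} 0 & a \\ a & 0 \end{bmatrix}^* \begin{bmatrix} c \\ c \end{bmatrix} 
  = \begin{bmatrix} (aa)^* & (aa)^*a \\ (aa)^*a & (aa)^* \end{bmatrix} \begin{bmatrix} c \\ c \end{bmatrix} 
  = \begin{bmatrix} (aa)^*c + (aa)^*ac \\ (aa)^*ac + (aa)^*c \end{bmatrix} 
  = \begin{bmatrix} a^*c \\ a^*c \end{bmatrix},$$ 

which is the same as that obtained by lifting the least solution $a^*c$ of the collapsed 
system $aX + c \le X$. 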
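
Example 1 can be checked in a concrete Kleene algebra. The sketch below uses binary relations on a four-element set (union as sum, composition as product, reflexive-transitive closure as star); the particular relations `a` and `c`, the helper names, and the use of the standard 2x2 block formula for the matrix star are assumptions made for illustration only.

```python
def compose(r, s):
    """Relational composition r ; s (the Kleene-algebra product)."""
    return frozenset((x, z) for (x, y) in r for (y2, z) in s if y == y2)

def star(r, n):
    """Reflexive-transitive closure of r on {0, ..., n-1} (the Kleene star)."""
    result = frozenset((i, i) for i in range(n))
    while True:
        bigger = result | compose(result, r)
        if bigger == result:
            return result
        result = bigger

def star_2x2(p, q, r, s, n):
    """Star of the block matrix [[p, q], [r, s]] via the usual 2x2 formula."""
    s_star = star(s, n)
    f_star = star(p | compose(compose(q, s_star), r), n)
    top_right = compose(compose(f_star, q), s_star)
    bottom_left = compose(compose(s_star, r), f_star)
    bottom_right = s_star | compose(compose(bottom_left, q), s_star)
    return f_star, top_right, bottom_left, bottom_right

N = 4
EMPTY = frozenset()
a = frozenset({(0, 1), (1, 2), (2, 3)})  # arbitrary test relation
c = frozenset({(3, 0), (1, 1)})          # arbitrary test relation

tl, tr, bl, br = star_2x2(EMPTY, a, a, EMPTY, N)
X = compose(tl, c) | compose(tr, c)      # first component of e*c
Y = compose(bl, c) | compose(br, c)      # second component of e*c
lifted = compose(star(a, N), c)          # a*c from the collapsed system
print(X == lifted and Y == lifted)
```

Both components of the unfolded solution coincide with the lifted solution of the collapsed system, as the example asserts.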

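We show that in general, the least solution of $S$ is obtained by lifting the least 
solution of $S/{\equiv}$. Define $\chi : A \times A/{\equiv} \to K$ by 

$$\chi_{\alpha,[\beta]} \overset{\mathrm{def}}{=} \begin{cases} 1, & \text{if } \alpha \equiv \beta \\ 0, & \text{otherwise.} \end{cases}$$ 

The matrix $\chi$ is called the characteristic matrix of $\equiv$. To lift a solution from $S/{\equiv}$ to $S$, 
we multiply it on the left by $\chi$; thus in the above example, $\hat\sigma = \chi\sigma$. 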

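Now for any $\alpha, \gamma$, 

$$\begin{aligned} 
(e\chi)_{\alpha,[\gamma]} &= \sum_{\alpha a} e_{\alpha,\alpha a}\, \chi_{\alpha a,[\gamma]} \\ 
  &= \sum_{\alpha a \equiv \gamma} e_{\alpha,\alpha a} \\ 
  &= (e/{\equiv})_{[\alpha],[\gamma]} \\ 
  &= \sum_{[\beta]} \chi_{\alpha,[\beta]}\, (e/{\equiv})_{[\beta],[\gamma]} \\ 
  &= (\chi(e/{\equiv}))_{\alpha,[\gamma]}, 
\end{aligned}$$ 

therefore $e\chi = \chi(e/{\equiv})$. By (1) (see [13]), $e^*\chi = \chi(e/{\equiv})^*$. Since $c = \chi(c/{\equiv})$, we 
have 

$$e^*c = e^*\chi(c/{\equiv}) = \chi(e/{\equiv})^*(c/{\equiv}),$$ 

which shows that the least solution $e^*c$ of $S$ is obtained by lifting the least solution 
$(e/{\equiv})^*(c/{\equiv})$ of $S/{\equiv}$. 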



7 Brzozowski Derivatives 

For $x \in \Sigma^*$, the Brzozowski derivative $D_x$ was originally defined by Brzozowski [3,4] as 
a map $2^{\Sigma^*} \to 2^{\Sigma^*}$ such that 

$$D_x(A) \overset{\mathrm{def}}{=} \{y \in \Sigma^* \mid xy \in A\};$$ 

that is, the set of strings obtained by removing $x$ from the front of a string in $A$. It 
follows from elementary arguments that $D_x(A)$ is a regular set if $A$ is. 

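Here we wish to consider $D_x$ as an operator on $\mathcal{F}_\Sigma$. Without knowing that 
$\mathcal{F}_\Sigma \cong \mathrm{Reg}_\Sigma$, we could have defined $D_a$ on $\mathcal{F}_\Sigma$ inductively as follows. For $a \in \Sigma$, 

$$\begin{aligned} 
D_a(0) &= D_a(1) = D_a(b) = 0, \quad b \ne a \\ 
D_a(a) &= 1 \\ 
D_a(\alpha + \beta) &= D_a(\alpha) + D_a(\beta) \\ 
D_a(\alpha\beta) &= D_a(\alpha)\beta + \varepsilon(\alpha) D_a(\beta) \qquad (3) \\ 
D_a(\alpha^*) &= D_a(\alpha)\alpha^*, 
\end{aligned}$$ 

where $\varepsilon : \mathcal{F}_\Sigma \to 2$ is the evaluation morphism $\varepsilon(a) = 0$, $a \in \Sigma$. We then define 
$D_x$ inductively, writing $\epsilon$ for the empty string: 

$$D_\epsilon(\alpha) \overset{\mathrm{def}}{=} \alpha, \qquad D_{xa}(\alpha) \overset{\mathrm{def}}{=} D_a(D_x(\alpha)).$$ 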

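This definition agrees with Brzozowski's on $\mathrm{Reg}_\Sigma$ [3]. However, we must argue 
axiomatically that it is well defined on elements of $\mathcal{F}_\Sigma$; that is, if $\alpha = \beta$ is a theorem of 
Kleene algebra, then $D_a(\alpha) = D_a(\beta)$. This can be done by induction on the lengths 
of proofs. We argue the case of the Horn axiom $\alpha\gamma + \beta \le \gamma \Rightarrow \alpha^*\beta \le \gamma$ explicitly. 
Suppose we have derived $\alpha^*\beta \le \gamma$ by this rule, having previously proved $\alpha\gamma + \beta \le \gamma$. 
By the induction hypothesis, we have $D_a(\alpha\gamma + \beta) \le D_a(\gamma)$ and we wish to prove that 
$D_a(\alpha^*\beta) \le D_a(\gamma)$. 

$$\begin{aligned} 
D_a(\alpha^*\beta) &= D_a(\alpha^*)\beta + \varepsilon(\alpha^*) D_a(\beta) \\ 
  &= D_a(\alpha)\alpha^*\beta + D_a(\beta) \\ 
  &\le D_a(\alpha)\gamma + D_a(\beta) \\ 
  &\le D_a(\alpha)\gamma + \varepsilon(\alpha) D_a(\gamma) + D_a(\beta) \\ 
  &= D_a(\alpha\gamma + \beta) \\ 
  &\le D_a(\gamma). 
\end{aligned}$$ 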

The following lemmas list some basic properties of Brzozowski derivatives. All of 
these properties are well known and are easily derived by elementary inductive argu- 
ments using the laws of Kleene algebra. 

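Lemma 1. Let $R : \mathcal{F}_\Sigma \to \mathrm{Reg}_\Sigma$ be the canonical interpretation $R(a) = \{a\}$. 

(i) For $a \in \Sigma$, $a D_a(\beta) \le \beta$; 

(ii) If $1 \le \beta$, then for $m \ge n = |x|$, $D_x(\beta^m) = D_x(\beta^n)\beta^{m-n}$; 

(iii) For $n = |x|$, $D_x(\alpha^*) = D_x((1 + \alpha)^n)\alpha^*$; 

(iv) $D_x(\alpha\beta) = D_x(\alpha)\beta + \sum_{x = yz} \varepsilon(D_y(\alpha)) D_z(\beta)$; 

(v) $\varepsilon(D_x(\alpha\beta)) = \sum_{x = yz} \varepsilon(D_y(\alpha) D_z(\beta))$; 

(vi) $D_x(\alpha^*) = D_x(1) + D_x(\alpha)\alpha^* + \sum_{x = yz} \varepsilon(D_y(\alpha)) D_z(\alpha^*)$; 

(vii) $x \in R(\alpha)$ iff $\varepsilon(D_x(\alpha)) = 1$. 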
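
The inductive definition of $D_a$ and the membership criterion of Lemma 1(vii) translate directly into code. The following is a minimal sketch in which regular expressions are nested tuples; the tuple encoding and the function names are illustrative assumptions. When $\varepsilon(\alpha) = 0$, the summand $\varepsilon(\alpha) D_a(\beta)$ in the product rule is simply omitted, which is equal to the stated rule modulo the Kleene algebra axioms.

```python
ZERO = ("0",)
ONE = ("1",)

def sym(a):
    return ("sym", a)

def plus(r, s):
    return ("+", r, s)

def times(r, s):
    return (".", r, s)

def star(r):
    return ("*", r)

def eps(r):
    """Evaluation morphism: True iff the empty word belongs to R(r)."""
    tag = r[0]
    if tag in ("1", "*"):
        return True
    if tag in ("0", "sym"):
        return False
    if tag == "+":
        return eps(r[1]) or eps(r[2])
    return eps(r[1]) and eps(r[2])  # tag == "."

def deriv(a, r):
    """Brzozowski derivative D_a(r) for a single letter a."""
    tag = r[0]
    if tag in ("0", "1"):
        return ZERO
    if tag == "sym":
        return ONE if r[1] == a else ZERO
    if tag == "+":
        return plus(deriv(a, r[1]), deriv(a, r[2]))
    if tag == ".":
        left = times(deriv(a, r[1]), r[2])
        # the term eps(r1) * D_a(r2) contributes only when eps(r1) = 1
        return plus(left, deriv(a, r[2])) if eps(r[1]) else left
    return times(deriv(a, r[1]), r)  # tag == "*": D_a(r*) = D_a(r) r*

def deriv_word(x, r):
    """D_x(r), applying letter derivatives from left to right."""
    for a in x:
        r = deriv(a, r)
    return r

def matches(x, r):
    """Membership test of Lemma 1(vii): x in R(r) iff eps(D_x(r)) = 1."""
    return eps(deriv_word(x, r))

ab_star = star(times(sym("a"), sym("b")))            # (ab)*
ends_in_a = times(star(plus(sym("a"), sym("b"))), sym("a"))  # (a+b)*a
```

For instance, `matches("abab", ab_star)` evaluates the derivative letter by letter and then tests the empty word property of the result. No simplification of the derived expressions is attempted here, so their syntactic size may grow with the length of the input.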
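Proof. All follow by elementary inductive arguments from the definition of $D_x$ and the 
laws of Kleene algebra. We prove (vii) explicitly. Proceeding by induction on $\alpha$, the 
base cases $\alpha = 0$, $1$, or $a \in \Sigma$ are immediate. For expressions of the form $\alpha + \beta$, the 
result follows from the linearity of $R$, $\varepsilon$, and $D_x$. For the other compound expressions, 

$$\begin{aligned} 
x \in R(\alpha\beta) &\iff \exists y, z \;\; x = yz \wedge y \in R(\alpha) \wedge z \in R(\beta) \\ 
  &\iff \exists y, z \;\; x = yz \wedge \varepsilon(D_y(\alpha)) = 1 \wedge \varepsilon(D_z(\beta)) = 1 \\ 
  &\iff \sum_{x = yz} \varepsilon(D_y(\alpha) D_z(\beta)) = 1 \\ 
  &\iff \varepsilon(D_x(\alpha\beta)) = 1 \quad \text{by (v);} 
\end{aligned}$$ 

$$\begin{aligned} 
x \in R(\alpha^*) &\iff x \in R((1 + \alpha)^n), \text{ where } n = |x| \\ 
  &\iff \varepsilon(D_x((1 + \alpha)^n)) = 1 \\ 
  &\iff \varepsilon(D_x((1 + \alpha)^n))\, \varepsilon(\alpha^*) = 1 \\ 
  &\iff \varepsilon(D_x((1 + \alpha)^n)\alpha^*) = 1 \\ 
  &\iff \varepsilon(D_x(\alpha^*)) = 1 \quad \text{by (iii).} 
\end{aligned}$$ 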



8 Brzozowski Systems 

A class of automatic systems can be defined in terms of Brzozowski derivatives. We 
take the set $A$ in Section 5 to be $\mathcal{F}_\Sigma$ and define the action of $a \in \Sigma$ on $\mathcal{F}_\Sigma$ as $D_a$. That 
is, for all $\alpha \in \mathcal{F}_\Sigma$, $\alpha a \overset{\mathrm{def}}{=} D_a(\alpha)$. We must argue that the induced preorder is finitary. 
The proof of Brzozowski (see [4]) depends on the interpretation $\mathrm{Reg}_\Sigma$, but we must 
argue axiomatically. 

Lemma 2. For any $\alpha$, the set $\{\alpha x \mid x \in \Sigma^*\} = \{D_x(\alpha) \mid x \in \Sigma^*\}$ is finite. 

Proof. The proof proceeds by induction on $\alpha$. For $\alpha$ of the form $0$, $1$, or $a \in \Sigma$, the 
result is easy. For $\alpha + \beta$, the result follows from the linearity of $D_x$ and the induction 
hypothesis. For $\alpha\beta$, the result follows from Lemma 1(iv) and the induction hypothesis. 
Finally, for $\alpha^*$, the result follows from Lemma 1(vi) and the induction hypothesis. 

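Now consider the system $S = (\mathcal{F}_\Sigma, e, c)$, where 

$$e_{\alpha,\alpha a} \overset{\mathrm{def}}{=} \sum_{\alpha b = \alpha a} b, \qquad c_\alpha \overset{\mathrm{def}}{=} \varepsilon(\alpha).$$ 

We call this system the Brzozowski system on $\Sigma$. The least solution of this system 
over $\mathcal{F}_\Sigma$ is $\ell = e^*c$. The key property that we need is that $\ell$, considered as a map 
$\ell : \mathcal{F}_\Sigma \to \mathcal{F}_\Sigma$, is a homomorphism. We show in fact that $\ell$ is $\iota$, the identity on $\mathcal{F}_\Sigma$. 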

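Lemma 3. The identity map $\iota : \alpha \mapsto \alpha$ is the least solution to the Brzozowski system. 

Proof. First we show that $\ell \le \iota$. It suffices to show that $\iota$ is a solution to $S$. We must 
argue that for all $\alpha \in \mathcal{F}_\Sigma$, 

$$\sum_{a \in \Sigma} a D_a(\alpha) + \varepsilon(\alpha) \le \alpha.$$ 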
But this is immediate from Lemma 1(i) and the property $\varepsilon(\beta) \le \beta$ noted in Section 2. 

Now we show that $\iota$ is the least solution to $S$. The major portion of the work is 
involved in showing that if $\alpha \le \beta$, then $\ell_\alpha \le \ell_\beta$. We use the Myhill-Nerode 
theory developed in Section 6 to find a common unwinding of the Brzozowski system $S$, 
allowing us to compare $\ell_\alpha$ and $\ell_\beta$. 

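First, lift the system $S$ to the product $\mathcal{F}_\Sigma \times \mathcal{F}_\Sigma$ under each of the two projection 
maps to obtain two systems $U = (\mathcal{F}_\Sigma \times \mathcal{F}_\Sigma, e, c)$ and $V = (\mathcal{F}_\Sigma \times \mathcal{F}_\Sigma, e, d)$, where 

$$e_{(\gamma,\delta),(\gamma,\delta)a} \overset{\mathrm{def}}{=} \sum_{\substack{\gamma b = \gamma a \\ \delta b = \delta a}} b, \qquad c_{\gamma,\delta} \overset{\mathrm{def}}{=} \varepsilon(\gamma), \qquad d_{\gamma,\delta} \overset{\mathrm{def}}{=} \varepsilon(\delta).$$ 

The relations defined by the two projections, 

$$(\gamma, \delta) \equiv_1 (\gamma', \delta') \iff \gamma = \gamma', \qquad (\gamma, \delta) \equiv_2 (\gamma', \delta') \iff \delta = \delta',$$ 

are Myhill-Nerode. 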

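Now restrict these systems to the finite induced subsystems on 

$$(\mathcal{F}_\Sigma \times \mathcal{F}_\Sigma)_{(\alpha,\beta)} = \{(\alpha x, \beta x) \mid x \in \Sigma^*\}$$ 

to obtain $U' = ((\mathcal{F}_\Sigma \times \mathcal{F}_\Sigma)_{(\alpha,\beta)}, e', c')$ and $V' = ((\mathcal{F}_\Sigma \times \mathcal{F}_\Sigma)_{(\alpha,\beta)}, e', d')$, where $e'$, 
$c'$, and $d'$ are $e$, $c$, and $d$, respectively, restricted to $(\mathcal{F}_\Sigma \times \mathcal{F}_\Sigma)_{(\alpha,\beta)}$. The least solution 
of $U'$ is $e'^*c'$ and the least solution of $V'$ is $e'^*d'$. Moreover, by linearity, $\varepsilon(D_x(\alpha)) \le 
\varepsilon(D_x(\beta))$ for all $x \in \Sigma^*$, therefore $c' \le d'$ and 

$$\ell_\alpha = (e'^*c')_{(\alpha,\beta)} \le (e'^*d')_{(\alpha,\beta)} = \ell_\beta.$$ 

We have shown that $\alpha \le \beta$ implies $\ell_\alpha \le \ell_\beta$. It follows that 

$$\ell_\alpha + \ell_\beta \le \ell_{\alpha+\beta}. \qquad (4)$$ 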

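Now we show that $\alpha \le \ell_\alpha$ for all $\alpha$. We actually show that 
$\alpha\ell_\beta \le \ell_{\alpha\beta}$ for all $\alpha$ and $\beta$ by induction on $\alpha$. 

For atomic expressions, we have 

$$\begin{aligned} 
\ell_{0\beta} &= \ell_0 = 0 = 0\ell_\beta, \\ 
\ell_{1\beta} &= \ell_\beta = 1\ell_\beta, \\ 
\ell_{b\beta} &= \sum_{a \in \Sigma} a\, \ell_{D_a(b\beta)} + \varepsilon(b\beta) \\ 
  &= \sum_{a \in \Sigma} a\, \ell_{D_a(b)\beta} \\ 
  &= b\, \ell_{D_b(b)\beta} \\ 
  &= b\ell_\beta, \qquad b \in \Sigma. 
\end{aligned}$$ 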
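For compound expressions, 

$$\begin{aligned} 
(\alpha + \gamma)\ell_\beta &= \alpha\ell_\beta + \gamma\ell_\beta \\ 
  &\le \ell_{\alpha\beta} + \ell_{\gamma\beta} && \text{by the induction hypothesis} \\ 
  &\le \ell_{(\alpha+\gamma)\beta} && \text{by (4),} \\ 
\alpha\gamma\ell_\beta &\le \alpha\ell_{\gamma\beta} && \text{by the induction hypothesis on } \gamma \\ 
  &\le \ell_{\alpha\gamma\beta} && \text{by the induction hypothesis on } \alpha. 
\end{aligned}$$ 

Finally, to show $\alpha^*\ell_\beta \le \ell_{\alpha^*\beta}$, by an axiom of Kleene algebra it is enough to show 
$\ell_\beta + \alpha\ell_{\alpha^*\beta} \le \ell_{\alpha^*\beta}$. We have 

$$\begin{aligned} 
\ell_\beta + \alpha\ell_{\alpha^*\beta} &\le \ell_\beta + \ell_{\alpha\alpha^*\beta} && \text{by the induction hypothesis} \\ 
  &\le \ell_{\beta + \alpha\alpha^*\beta} && \text{by (4)} \\ 
  &= \ell_{\alpha^*\beta}. 
\end{aligned}$$ 

Thus $\ell_\alpha \le \alpha$ since $\iota$ is a solution and $\ell$ is the least solution, and $\alpha \le \ell_\alpha$ by taking 
$\beta = 1$ in the argument above, therefore $\ell_\alpha = \alpha$. 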

9 Completeness 

The completeness result of [10], which states that the free Kleene algebra $\mathcal{F}_\Sigma$ and the 
Kleene algebra of regular sets $\mathrm{Reg}_\Sigma$ are isomorphic, follows from the considerations 
of the previous sections. Let $R : \mathcal{F}_\Sigma \to \mathrm{Reg}_\Sigma$ be the canonical interpretation in which 
$R(a) = \{a\}$. If $R(\alpha) = R(\beta)$, then for all $x \in \Sigma^*$, $x \in R(\alpha)$ iff $x \in R(\beta)$, therefore 
by Lemma 1(vii), $\varepsilon(D_x(\alpha)) = \varepsilon(D_x(\beta))$. This says that the common unwinding of the 
Brzozowski system $S$ on $\mathcal{F}_\Sigma \times \mathcal{F}_\Sigma$ restricted to $(\mathcal{F}_\Sigma \times \mathcal{F}_\Sigma)_{(\alpha,\beta)}$ gives identical systems, 
therefore their solutions are equal. In particular, $\ell_\alpha = \ell_\beta$. By Lemma 3, $\alpha = \beta$. 
A set of Parikh vectors is regular if it is $L(\alpha)$ for some $\alpha$. The family of all regular sets 
of Parikh vectors forms a commutative Kleene algebra under the above operations. We 
denote this algebra by $\mathrm{Par}_n$. 

The completeness result follows from a characterization due to Redko (see [4]) 
of the equational theory of $\mathrm{Par}_n$ as the consequences of a certain infinite but easily 
described set of equations, namely the equational axioms for commutative idempotent 
semirings plus the equations 

$$\begin{aligned} 
(x + y)^* &= (x^*y)^*x^* & x^{**} &= x^* \\ 
(xy)^*x &= x(yx)^* & x^*y^* &= (xy)^*(x^* + y^*) \\ 
x^* &= 1 + xx^* & x^* &= (x^m)^*(1 + x)^{m-1}, \quad m \ge 1. 
\end{aligned}$$ 

All these are theorems of commutative Kleene algebra. 

The proof of Redko, as given in [4], is quite involved and depends heavily on com-
mutativity. We began this investigation in an attempt to give a uniform completeness
proof for both the noncommutative and commutative cases. Our hope was to give a
simpler algebraic proof along the lines of [10] for commutative Kleene algebra, al-
though the technique of [10] does not apply directly, since minimal automata are not
unique. For example, the three-state deterministic automata corresponding to the ex-
pressions $(ab)^*$ and $(ba)^*$ are both minimal and represent the same set of Parikh vectors
$\{(m, m) \mid m \ge 0\}$. The usual construction of the canonical deterministic automaton
directly from the set itself (see [12, Lemma 16.2]) yields infinitely many states.
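The coincidence of the two Parikh images can be checked on small instances. The following is a minimal sketch, not part of the original development: the helper `parikh`, the alphabet ordering and the bound 5 are illustrative assumptions.

```python
from collections import Counter

def parikh(word, alphabet):
    """Return the Parikh vector of `word`: letter counts in alphabet order."""
    counts = Counter(word)
    return tuple(counts[letter] for letter in alphabet)

# Hypothetical two-letter alphabet, ordered as (a, b).
ALPHABET = ("a", "b")

# Parikh images of the words (ab)^m and (ba)^m for m below an arbitrary bound.
ab_star = {parikh("ab" * m, ALPHABET) for m in range(5)}
ba_star = {parikh("ba" * m, ALPHABET) for m in range(5)}
```

Both sets consist of the diagonal vectors $(m, m)$, although the words themselves differ as soon as $m \ge 1$.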

Nevertheless, one can define the free commutative Kleene algebra $C_\Sigma$ on generators
$\Sigma$ and attempt to show that $L$, factored through $C_\Sigma$, gives an isomorphism $C_\Sigma \cong \mathrm{Par}_n$.
The Brzozowski derivatives $D_a : C_\Sigma \to C_\Sigma$ are defined differently on products in the
commutative case:

$$D_a(\alpha\beta) = D_a(\alpha)\beta + \alpha D_a(\beta).$$

The action of $D_a$ on other expressions is as defined in Section 7. As in that section,
we can argue that $D_a$ respects the axioms of Kleene algebra. Here we must also show
that it respects the commutativity axiom; in other words, $D_a(\alpha\beta) = D_a(\beta\alpha)$. Also,
for any $x, y \in \Sigma^*$, $D_{xy}(\alpha) = D_{yx}(\alpha)$. Unfortunately, the principal upward closed sets
$\{D_x(\alpha) \mid x \in \Sigma^*\}$ are not necessarily finite, and it is not clear how to define a Kleene
algebra structure of infinite matrices as in Section 3. Nevertheless, the set
$\{D_x(\alpha) \mid x \in \Sigma^*\}$ does exhibit a regular $(n - 1)$-dimensional linear geometric structure which
is respected by the action of the Brzozowski derivatives. It remains a topic for future
investigation to see how this structure can be exploited.
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Abstract. 2-nested simulation was introduced by Groote and Vaan- 
drager [10] as the coarsest equivalence included in completed trace equiv- 
alence for which the tyft/tyxt format is a congruence format. In the linear 
time-branching time spectrum of van Glabbeek [8], 2-nested simulation 
is one of the few equivalences for which no finite equational axiomati- 
zation is presented. In this paper we prove that such an axiomatization 
does not exist for 2-nested simulation. 



1 Introduction 

Labelled transition systems (LTSs) [11] are a fundamental model of concurrent 
computation, which is widely used in light of its flexibility and applicability. In 
particular, they are the prime model underlying Plotkin’s Structural Operational 
Semantics [18] and, following Milner’s pioneering work on CCS [14], are by now 
the standard semantic model for various process description languages. 

LTSs model processes by explicitly describing their states and their transi- 
tions from state to state, together with the actions that produced them. Since 
this view of process behaviours is very detailed, several notions of behavioural 
equivalence and preorder have been proposed for LTSs. The aim of such be- 
havioural semantics is to identify those (states of) LTSs that afford the same 
“observations”, in some appropriate technical sense. The lack of consensus on 
what constitutes an appropriate notion of observable behaviour for reactive sys- 
tems has led to a large number of proposals for behavioural equivalences for con- 
current processes. (Cf. the encyclopaedic study [8], where van Glabbeek presents 
the linear time-branching time spectrum — a lattice that contains all the known 
behavioural equivalences and preorders over LTSs, ordered by inclusion.) 

One of the criteria that has been put forward for studying the mathematical 
tractability of the behavioural equivalences in the linear time-branching time 
spectrum is that they afford elegant, finite equational axiomatizations over frag- 
ments of process algebraic languages. Equationally based proof systems play an 



A. Ferreira and H. Reichel (Eds.): STACS 2001, LNCS 2010, pp. 39-50, 2001.
© Springer-Verlag Berlin Heidelberg 2001







Luca Aceto, Wan Fokkink, and Anna Ingolfsdottir 



important role in both the practice and the theory of process algebras. From the 
point of view of practice, these proof systems can be used to perform system 
verifications in a purely syntactic way, and form the basis of axiomatic verifi- 
cation tools like, e.g., PAM [12]. From the theoretical point of view, complete 
axiomatizations of behavioural equivalences capture the essence of different no- 
tions of semantics for processes in terms of a basic collection of identities, and 
this often allows one to compare semantics which may have been defined in very 
different styles and frameworks. A review of existing complete equational ax-
iomatizations for many of the behavioural semantics in van Glabbeek’s spectrum
is offered in [8]. The equational axiomatizations offered ibidem are over Milner’s
Basic CCS (abbreviated to BCCS in what follows), a fragment of CCS suit- 
able for describing finite synchronization trees, and characterize the differences 
between behavioural semantics in terms of a few revealing axioms. 

The main omission in this menagerie of equational axiomatizations for the 
behavioural semantics in van Glabbeek’s spectrum is an axiomatization for 2-
nested simulation semantics. 2-nested simulation was introduced by Groote and
Vaandrager [10] as the coarsest equivalence included in completed trace equiva- 
lence for which the tyft/tyxt format is a congruence format. It thus characterizes 
the distinctions amongst processes that can be made by observing their termi- 
nation behaviour in program contexts that can be built using a wide array of 
operators. (The interested reader is referred to op. cit. for motivation and the 
basic theory of 2-nested simulation.) 2-nested simulation can be decided over 
finite LTSs in time that is quadratic in their number of transitions [21], and 
can be characterized by a single parameterised modal logic formula [15]. How- 
ever, as previously mentioned, no equational axiomatization for it has ever been 
proposed, even for the language BCCS. 

In this paper, we offer a possible mathematical justification for the lack of an 
equational axiomatization for the 2-nested simulation equivalence and preorder 
even for the language of finite synchronization trees. More precisely, we show that 
neither of these two behavioural relations has a finite equational axiomatization 
over the language of BCCS. These results hold in a very strong form. Indeed, 
we prove that no finite collection of inequations that are sound with respect to 
the 2-nested simulation preorder can prove all of the inequalities of the form 

$$a^{2m} \le a^{2m} + a^m \quad (m \ge 0),$$

which are sound with respect to the 2-nested simulation preorder. Similarly, we 
establish a result to the effect that no finite collection of equations that are sound 
with respect to 2-nested simulation equivalence can be used to derive all of the 
sound equalities of the form 

$$a(a^{2m} + a^m) \approx a(a^{2m} + a^m) + a^{2m+1} \quad (m \ge 0).$$

The import of these two results is that not only the equational theory of 2-nested 
simulation is not finitely equationally axiomatizable, but neither is the collection 
of (in)equivalences that hold between BCCS terms over one action and without 
occurrences of variables. This state of affairs should be contrasted with the el-
egant equational axiomatizations over BCCS for most of the other behavioural
equivalences in the linear time-branching time spectrum that are reviewed by
van Glabbeek in [8]. Only in the case of additional, more complex operators, such
as iteration, are these equivalences known to lack a finite equational axiomati-
zation; see, e.g., [3,6,7,19,20]. Of special relevance for concurrency theory are
Moller’s results to the effect that the process algebras ACP and CCS (without
the auxiliary left merge operator from [5]) do not have a finite equational axiom-
atization modulo bisimulation equivalence [16,17]. Aceto, Ésik and Ingólfsdóttir
[2] proved that there is no finite equational axiomatization that is $\omega$-complete for
the max-plus algebra of the natural numbers, a result whose process algebraic
implications are discussed in [1].

The paper is organized as follows. We begin by presenting preliminaries on 
the language BCCS and (in)equational logic (Sect. 2). We then proceed to define
2-nested simulation, and study some of its basic properties that play a major role 
in the proof of our main results (Sect. 3). The definition of 2-nested simulation 
suggests a natural conditional inference system for it. This is presented in Sect. 4. 
Our main results on the non-existence of finite (in)equational axiomatizations 
for 2-nested equivalence and preorder are the topic of Sects. 5 and 6. The paper 
concludes with a result to the effect that the 3-nested simulation preorder has 
no finite inequational axiomatization, and some open problems (Sect. 7). 



2 Preliminaries 

The Language BCCS. The process algebra BCCS [14] is a basic formalism to
express finite process behaviour. Its syntax consists of (process) terms that are
constructed from a countably infinite set of variables (with typical elements
$x, y, z$), a constant 0, a binary operator + called alternative composition, and
unary prefixing operators $a$, where $a$ ranges over some nonempty set Act of atomic
actions. We shall use the meta-variables $t, u, v$ to range over process terms, and
write $var(t)$ for the collection of variables occurring in the term $t$.

A process term is closed if it does not contain any variables. Closed terms
will be typically denoted by $p, q, r$. Intuitively, closed terms represent completely
specified finite process behaviours, where 0 does not exhibit any behaviour, $p + q$
combines the behaviours of $p$ and $q$, and $ap$ can execute action $a$ to transform
into $p$. This intuition for the operators of BCCS is captured, in the style of
Plotkin [18], by the transition rules in Table 1. These transition rules give rise to
transitions between process terms. The operational semantics for BCCS is thus
given by the labelled transition system [11] whose states are terms, and whose
Act-labelled transitions are those that are provable using the rules in Table 1.

A (closed) substitution is a mapping from process variables to (closed) BCCS
terms. For every term $t$ and (closed) substitution $\sigma$, the (closed) term obtained
by replacing every occurrence of a variable $x$ in $t$ with the (closed) term $\sigma(x)$
will be written $\sigma(t)$.




Table 1. Transition Rules for BCCS 




Table 2. Axioms for BCCS

A1   $x + y \approx y + x$
A2   $(x + y) + z \approx x + (y + z)$
A3   $x + x \approx x$
A4   $x + 0 \approx x$



In the remainder of this paper, process terms are considered modulo associa-
tivity and commutativity of +, and modulo absorption of 0 summands. In other
words, we do not distinguish $t + u$ and $u + t$, nor $(t + u) + v$ and $t + (u + v)$, nor $t + 0$
and $t$. This is justified because all of the behavioural equivalences we consider
satisfy axioms A1, A2 and A4 in Table 2. In what follows, the symbol = will
denote syntactic equality modulo axioms A1, A2 and A4. We use a summation
$\sum_{i \in \{1,\ldots,k\}} t_i$ to denote $t_1 + \cdots + t_k$, where the empty sum represents 0. It is
easy to see that, modulo the equations A1, A2 and A4, every BCCS term $t$ has
the form $\sum_{i \in I} x_i + \sum_{j \in J} a_j t_j$ for some finite index sets $I, J$, terms $a_j t_j$ ($j \in J$)
and variables $x_i$ ($i \in I$).

Equational Logic. An axiom system is a collection of (in)equations over the
language BCCS. We say that an equation $t \approx u$ (resp. an inequation $t \le u$)
is derivable from an axiom system $E$ if it can be proven from the axioms in $E$
using the standard rules of equational (resp. inequational) logic. It is well-known
(cf., e.g., Sect. 2 in [9]) that if an (in)equation relating two closed terms can be
proven from an axiom system $E$, then there is a closed proof for it.

In the proofs of our main results (cf. Thms. 3 and 4), it will be convenient to 
use a different formulation of the notion of provability of an (in)equation from
a set of axioms. This we now proceed to define for the sake of clarity. 

A context $C[]$ is a closed BCCS term with exactly one occurrence of a hole $[]$
in it. For every context $C[]$ and closed term $p$, we write $C[p]$ for the closed term
that results by placing $p$ in the hole in $C[]$. It is not hard to see that an equation
$p \approx q$ is provable from an equational axiom system $E$ iff there is a sequence
$p_1 \approx \cdots \approx p_k$ ($k \ge 1$) such that

- $p = p_1$, $q = p_k$ and

- $p_i = C[\sigma(t)] \approx C[\sigma(u)] = p_{i+1}$ for some closed substitution $\sigma$, context $C[]$
and pair of terms $t, u$ with either $t \approx u$ or $u \approx t$ an axiom in $E$ ($1 \le i < k$).

The obvious modification of the above observation applies to proofs of inequa-
tions from inequational axiom systems. In what follows, we shall refer to se-
quences of the form $p_1 \approx \cdots \approx p_k$ (resp. $p_1 \le \cdots \le p_k$) as equational (resp. in-
equational) derivations.

For later use, note that, using axioms A1, A2 and A4 in Table 2, every context
can be proven equal to either one of the form $C[b([] + p)]$ or to one of the form
$[] + p$, for some action $b$ and closed BCCS term $p$.

3 2-Nested Simulation 

In this paper, we shall study the (in)equational theory of 2-nested simulation 
semantics over BCCS. This is a behavioural semantics for processes that stems 
from [10], where it was characterized as the largest congruence with respect to 
the tyft/tyxt format of transition rules which is included in completed trace 
semantics. 

Definition 1. A binary relation $R$ between closed terms is a simulation iff $p \mathrel{R} q$
together with $p \xrightarrow{a} p'$ implies that there is a transition $q \xrightarrow{a} q'$ with $p' \mathrel{R} q'$.

For closed terms $p, q$, we write $p \sqsubseteq^1 q$ iff $p \mathrel{R} q$ with $R$ a simulation. The
kernel of $\sqsubseteq^1$ (i.e., the equivalence $\sqsubseteq^1 \cap (\sqsubseteq^1)^{-1}$) is denoted by $\simeq^1$.

The relation $\sqsubseteq^1$ is the well-known simulation preorder [13].

Definition 2. For closed terms $p, q$, we write $p \sqsubseteq^2 q$ iff $p \mathrel{R} q$ with $R$ a simula-
tion and $R^{-1}$ included in $\sqsubseteq^1$. The kernel of $\sqsubseteq^2$ (i.e., the equivalence $\sqsubseteq^2 \cap (\sqsubseteq^2)^{-1}$)
is denoted by $\simeq^2$.

The relations $\sqsubseteq^2$ and $\simeq^2$ are the 2-nested simulation preorder and the 2-nested
simulation equivalence, respectively. It is easy to see that $\sqsubseteq^2$ is included in $\simeq^1$.
In the remainder of this paper we will use, instead of Definition 2, the following
more descriptive, fixed-point characterization of 2-nested simulation. To the best
of our knowledge, this characterization is new.

Theorem 1. Let $p, q$ be closed BCCS terms. Then $p \sqsubseteq^2 q$ iff

(1) for all $p \xrightarrow{a} p'$ there is a $q \xrightarrow{a} q'$ with $p' \sqsubseteq^2 q'$, and

(2) $q \sqsubseteq^1 p$.
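Because closed BCCS terms are finite trees, the characterization in Theorem 1 can be turned directly into a recursive check. The sketch below is an illustration under stated assumptions, not the authors' algorithm: closed terms are encoded as frozensets of (action, successor) pairs, and the names `simulated_by`, `nested2` and `power` are hypothetical.

```python
from functools import lru_cache

NIL = frozenset()  # assumed encoding of the constant 0

def prefix(action, term):
    """The term `action.term` in the assumed set-of-transitions encoding."""
    return frozenset({(action, term)})

def power(action, n):
    """The term a^n, with a^0 = 0 and a^(n+1) = a(a^n)."""
    term = NIL
    for _ in range(n):
        term = prefix(action, term)
    return term

@lru_cache(maxsize=None)
def simulated_by(p, q):
    """Simulation preorder on finite trees: every step of p is matched by q."""
    return all(
        any(a == b and simulated_by(p1, q1) for (b, q1) in q)
        for (a, p1) in p
    )

@lru_cache(maxsize=None)
def nested2(p, q):
    """2-nested simulation preorder, following clauses (1) and (2) of Theorem 1."""
    step = all(
        any(a == b and nested2(p1, q1) for (b, q1) in q)
        for (a, p1) in p
    )
    return step and simulated_by(q, p)
```

With `a1 = power("a", 1)` and `a2 = power("a", 2)`, the call `simulated_by(a1, a2)` succeeds while `nested2(a1, a2)` fails, since clause (2) is violated one level down; and `nested2(a2, a2 | a1)` holds whereas the converse does not.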

The transition rules in Table 1 are in tyft/tyxt format, which is a (pre)congruence
format for $\sqsubseteq^2$ and $\simeq^2$ [10]. Hence, we immediately have that:

Lemma 1. The relations $\sqsubseteq^2$ and $\simeq^2$ are preserved by the operators of BCCS.

The relations $\sqsubseteq^2$ and $\simeq^2$ are extended to arbitrary BCCS terms thus:

Definition 3. Let $t, u$ be BCCS terms. The inequation $t \le u$ is sound with
respect to $\sqsubseteq^2$ iff $\sigma(t) \sqsubseteq^2 \sigma(u)$ holds for every closed substitution $\sigma$. Similarly,
the equation $t \approx u$ is sound with respect to $\simeq^2$ iff $\sigma(t) \simeq^2 \sigma(u)$ holds for every
closed substitution $\sigma$.

Examples of (in)equations that are sound with respect to $\sqsubseteq^2$ are those in Table 2
and $a(x + y) \le a(x + y) + ax$.
Table 3. Axiom for Simulation

S    $x \le_1 x + y$

Table 4. Axiom for 2-Nested Simulation

2S   $y \le_1 x \;\Rightarrow\; x \le_2 x + y$


Norm and Depth. We now present some results on the depth and the norm of 
BCCS terms that are related in 2-nested simulation semantics. These will find 
important applications in the proofs of our main results, and shed light on the 
nature of the identifications made by 2-nested simulation semantics. 

Definition 4. A sequence $a_1 \cdots a_k \in Act^*$ ($k \ge 0$) is a termination trace of a
term $t$ iff there exists a sequence of transitions $t = t_0 \xrightarrow{a_1} t_1 \xrightarrow{a_2} \cdots \xrightarrow{a_k} t_k$ with $t_k$
a term without outgoing transitions.

Definition 5. The depth and the norm of a BCCS term $t$, denoted by $depth(t)$
and $norm(t)$, are the lengths of the longest and of the shortest termination trace
of $t$, respectively.
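For closed terms, depth and norm can be computed from the set of lengths of termination traces. The following sketch again assumes the encoding of closed terms as frozensets of (action, successor) pairs; the function names are illustrative and not taken from the paper.

```python
NIL = frozenset()  # assumed encoding of the constant 0

def prefix(action, term):
    """The term `action.term` in the assumed set-of-transitions encoding."""
    return frozenset({(action, term)})

def power(action, n):
    """The term a^n, with a^0 = 0 and a^(n+1) = a(a^n)."""
    term = NIL
    for _ in range(n):
        term = prefix(action, term)
    return term

def trace_lengths(term):
    """Lengths of all termination traces of a closed term."""
    if not term:
        return {0}  # a term without transitions has only the empty trace
    return {1 + n for (_, succ) in term for n in trace_lengths(succ)}

def depth(term):
    """Length of the longest termination trace."""
    return max(trace_lengths(term))

def norm(term):
    """Length of the shortest termination trace."""
    return min(trace_lengths(term))
```

As a usage example, for $m = 2$ the term $a(a^{2m} + a^m)$ is `prefix("a", power("a", 4) | power("a", 2))`; its termination traces have lengths $m + 1$ and $2m + 1$, so its norm is 3 and its depth is 5.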



Lemma 2. If $p \sqsubseteq^2 q$, then

1. each termination trace of $p$ is a termination trace of $q$;

2. $depth(p) = depth(q)$; and

3. $norm(p) \ge norm(q)$.

4 A Conditional Axiomatization 

The definition of 2-nested simulation immediately suggests an implicational 
proof system for the 2-nested simulation preorder. It is folklore that the axioms 
in Tables 2 and 3 give a complete axiomatization of the simulation preorder over 
the language BCCS [8] . To obtain a complete inference system for the 2-nested 
simulation preorder, it is sufficient to add the conditional axiom in Table 4 to 
the axiom system in Table 2. In axioms S and 2S, the relation symbol refers 
to inequations that are provable using the proof system for the simulation pre- 
order, while the relation symbol $2 refers to inequations that are provable using 
the proof system for the 2-nested simulation preorder. Not too surprisingly, we 
have that: 

Theorem 2. A1-4+2S is sound and complete for BCCS modulo the 2-nested 
simulation preorder. 
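Proof. The soundness proof is left to the reader. We prove that A1-4+2S is
complete modulo the 2-nested simulation preorder. Suppose $p \sqsubseteq^2 q$. We prove,
by induction on the depth of $p$, that $p \le_2 q$ can be derived from A1-4+2S.

Let $p = \sum_{i \in I} a_i p_i$ and $q = \sum_{j \in J} b_j q_j$. Since $p \sqsubseteq^2 q$, for every $i \in I$ there is a
$j_i \in J$ such that $a_i = b_{j_i}$ and $p_i \sqsubseteq^2 q_{j_i}$. By the induction hypothesis, $p_i \le_2 q_{j_i}$ can
be derived from A1-4+2S. Hence, $\sum_{i \in I} a_i p_i \le_2 \sum_{i \in I} a_i q_{j_i}$ can be proven from
A1-4+2S.

Vice versa, since $q \sqsubseteq^1 p$, for each $\ell \in J$ there is an $i_\ell \in I$ such that $b_\ell =
a_{i_\ell} = b_{j_{i_\ell}}$ and $q_\ell \sqsubseteq^1 p_{i_\ell} \sqsubseteq^2 q_{j_{i_\ell}}$. By completeness of A1-4+S for the simulation
preorder, $b_\ell q_\ell \le_1 a_{i_\ell} q_{j_{i_\ell}}$ can be derived from A1-4+S. So $a_{i_\ell} q_{j_{i_\ell}} \le_2 a_{i_\ell} q_{j_{i_\ell}} + b_\ell q_\ell$
can be derived using 2S. Hence, $\sum_{\ell \in J} a_{i_\ell} q_{j_{i_\ell}} \le_2 \sum_{\ell \in J} (a_{i_\ell} q_{j_{i_\ell}} + b_\ell q_\ell)$ can be proven from
A1-4+2S. As the index set $\{j_{i_\ell} \mid \ell \in J\}$ is included in the set $\{j_i \mid i \in I\}$, we can
derive from A1-4+2S that

$$\sum_{i \in I} a_i q_{j_i} \approx \sum_{i \in I} a_i q_{j_i} + \sum_{\ell \in J} a_{i_\ell} q_{j_{i_\ell}}
  \le_2 \sum_{i \in I} a_i q_{j_i} + \sum_{j \in J} b_j q_j \approx \sum_{j \in J} b_j q_j.$$

By transitivity we conclude that $p \le_2 q$ can be derived from A1-4+2S. □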

The aforementioned proof system for the 2-nested simulation preorder, albeit 
very natural, includes the conditional axiom 2S; moreover, the condition of this 
axiom contains an auxiliary relation symbol that is not defined inductively on the 
syntax of BCCS. This raises the question of whether there exists a finite purely 
(in)equational axiomatization of 2-nested simulation preorder and/or equiva- 
lence at least over the language BCCS. The remainder of this study is devoted 
to showing that no finite (in)equational axiomatization of 2-nested simulation 
exists over BCCS. 

5 Inaxiomatizability of the 2-Nested Simulation Preorder 

In this section we prove that the 2-nested simulation preorder is not finitely 
inequationally axiomatizable. The following lemma will play a key role in the 
proof of this statement. In the lemma, and in the remainder of this paper, we 
let $a^0$ denote 0, and $a^{m+1}$ denote $a(a^m)$.

Lemma 3. If $p \sqsubseteq^2 a^{2m} + a^m$, then either $p \simeq^2 a^{2m}$ or $p \simeq^2 a^{2m} + a^m$.

The idea behind the proof that the 2-nested simulation preorder is not finitely
inequationally axiomatizable is as follows. Assume a finite inequational axioma-
tization $E$ for BCCS that is sound modulo $\sqsubseteq^2$. We show that, if $m$ is sufficiently
large, then, for all closed inequational derivations $a^{2m} \le p_1 \le \cdots \le p_k$ from $E$
with $p_k \sqsubseteq^2 a^{2m} + a^m$, we have that $p_k \simeq^2 a^{2m}$. So $a^{2m} \le a^{2m} + a^m$ cannot be
derived from $E$. Note that $a^{2m} \sqsubseteq^2 a^{2m} + a^m$.

Lemma 4. Let $t \le u$ be sound modulo $\sqsubseteq^2$. Let $m$ be greater than the depth of
$t$. Assume that $C[\sigma(u)] \sqsubseteq^2 a^{2m} + a^m$. Then $C[\sigma(t)] \simeq^2 a^{2m}$ implies $C[\sigma(u)] \simeq^2 a^{2m}$.
Proof. Let $C[\sigma(t)] \simeq^2 a^{2m}$; we prove $C[\sigma(u)] \simeq^2 a^{2m}$. Since $C[\sigma(u)] \sqsubseteq^2 a^{2m} +
a^m$, it is sufficient to show that $a^{2m} + a^m \not\sqsubseteq^2 C[\sigma(u)]$. In fact, if $C[\sigma(u)] \sqsubseteq^2
a^{2m} + a^m$ and $a^{2m} + a^m \not\sqsubseteq^2 C[\sigma(u)]$, by Lemma 3 it follows that $C[\sigma(u)] \simeq^2 a^{2m}$,
which is to be shown. We prove that $a^{2m} + a^m \not\sqsubseteq^2 C[\sigma(u)]$ holds by distinguishing
two cases, depending on the form of the context $C[]$.

- Case 1: Suppose $C[]$ is of the form $C'[b([] + r)]$.

Consider a transition $C[\sigma(u)] \xrightarrow{a} q'$. Since $C[]$ is of the form $C'[b([] + r)]$,
clearly there is a transition $C[\sigma(t)] \xrightarrow{a} p'$ where $p'$ can be obtained by re-
placing at most one subterm $\sigma(u)$ of $q'$ by $\sigma(t)$. Since $\sigma(t) \sqsubseteq^2 \sigma(u)$, by
Lemma 2(2), $\sigma(t)$ and $\sigma(u)$ have the same depth; so $p'$ and $q'$ have the same
depth as well. Since $C[\sigma(t)] \simeq^2 a^{2m}$, it follows that $p' \sqsubseteq^2 a^{2m-1}$. So by
Lemma 2(2), $depth(p') = depth(q') = 2m - 1$. As $depth(a^{m-1}) \ne 2m - 1$,
by Lemma 2(2) $a^{m-1} \not\sqsubseteq^2 q'$. This holds for all transitions $C[\sigma(u)] \xrightarrow{a} q'$, and
$a^{2m} + a^m \xrightarrow{a} a^{m-1}$, so $a^{2m} + a^m \not\sqsubseteq^2 C[\sigma(u)]$.

- Case 2: Suppose $C[]$ is of the form $[] + r$.

As $\rho(t) \sqsubseteq^2 \rho(u)$ for all closed substitutions $\rho$, by Lemma 2(2) $\rho(t)$ and $\rho(u)$
have the same depth for all $\rho$. Clearly this implies that $depth(t) = depth(u)$,
and moreover that $t$ and $u$ contain exactly the same variables.

Since $\sigma(t) + r \simeq^2 a^{2m}$, by Lemma 2(3) $norm(\sigma(t)) \ge 2m$ and $norm(r) \ge 2m$.

As $\sigma(u) + r \sqsubseteq^2 a^{2m} + a^m$, again by Lemma 2(3), we have that $norm(\sigma(u)) \ge m$.

Since $depth(t) < m$ and $norm(\sigma(t)) \ge 2m$, for each variable $x \in var(t) =
var(u)$ we have $norm(\sigma(x)) > m$.

By the fact that $depth(u) = depth(t) < m$ and $norm(\sigma(u)) \ge m$, each
termination trace of $\sigma(u)$ must become, after less than $m$ transitions, a
termination trace of a $\sigma(x)$ with $x \in var(u)$. Since for all $x \in var(u) = var(t)$
we have $norm(\sigma(x)) > m$, it follows that $norm(\sigma(u)) > m$. Since moreover
$norm(r) \ge 2m$, we have $norm(\sigma(u) + r) > m$. As $a^{2m} + a^m$ has norm $m$, by
Lemma 2(3) we may conclude that $a^{2m} + a^m \not\sqsubseteq^2 \sigma(u) + r$. □

Remark 1. The inequation ax "2 ax + a^ is sound modulo However, 
a'^ + a} . So the side condition in the statement of Lemma 4 that C[(t(m)] 

-k o’” cannot be omitted. (Note that a‘^ + a^ -k of.) 

Theorem 3. BCCS modulo the 2-nested simulation preorder is not finitely in- 
equationally axiomatizable. 

Proof. Let $E$ be a finite, non-empty inequational axiomatization for BCCS that
is sound modulo $\sqsubseteq^2$. Let $m > \max\{depth(t) \mid t \le u \in E\}$.

By Lemma 4, and using induction on the length of derivations, it follows that
if the closed inequation $a^{2m} \le r$ can be derived from $E$ and $r \sqsubseteq^2 a^{2m} + a^m$, then
$r \simeq^2 a^{2m}$. Since $a^{2m} + a^m \not\sqsubseteq^2 a^{2m}$ (Lemma 2(3)), it follows that $a^{2m} \le a^{2m} + a^m$
cannot be derived from $E$. Since $a^{2m} \sqsubseteq^2 a^{2m} + a^m$, we may conclude that $E$ is
not complete modulo $\sqsubseteq^2$. □
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6 Inaxiomatizability of 2-Nested Simulation Equivalence 

We now proceed to prove that the 2-nested simulation equivalence is not finitely 
equationally axiomatizable. The following lemma will play a key role in the proof 
of this statement. 

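Lemma 5. Let the inequational axiom $u \le t$ be sound modulo $\sqsubseteq^2$. If $t$ is of the
form $\sum_{i \in I} x_i + \sum_{j \in J} a_j t_j$ and $u$ is of the form $\sum_{k \in K} y_k + \sum_{\ell \in L} b_\ell u_\ell$, then

- $\{y_k \mid k \in K\} \subseteq \{x_i \mid i \in I\}$, and

- for each $\ell \in L$ there is a $j \in J$ such that $var(t_j) \subseteq var(u_\ell)$.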

Proof. Let $m$ be greater than the depth of $u$.

Assume, towards a contradiction, that $y_k \notin \{x_i \mid i \in I\}$ for some $k \in K$.
Let $\sigma(y_k) = a^m$ and let $\sigma(z) = 0$ for $z \ne y_k$. As $\sigma(y_k) \xrightarrow{a} a^{m-1}$, it follows
that $\sigma(u) \xrightarrow{a} a^{m-1}$, so $\sigma(u)$ has a termination trace of length $m$. On the other
hand, $\sigma(x_i) = 0$ for $i \in I$, and it is easy to see that no $\sigma(a_j t_j)$ for $j \in J$ has a
termination trace of length $m$; so $\sigma(t)$ does not have a termination trace of length
$m$. As $\sigma(u) \sqsubseteq^2 \sigma(t)$ by the soundness of $u \le t$, this contradicts Lemma 2(1).

Assume, towards a contradiction, that there is an $\ell \in L$ such that $var(t_j) \not\subseteq
var(u_\ell)$ for all $j \in J$. Let $\rho(z) = 0$ for $z \in var(u_\ell)$ and let $\rho(z) = a^m$ for
$z \notin var(u_\ell)$. Since $\rho(z) = 0$ for $z \in var(u_\ell)$, clearly $depth(\rho(u_\ell)) \le depth(u) - 1 <
m - 1$. On the other hand, for all transitions $\rho(t) \xrightarrow{c} p'$ we have $depth(p') \ge m - 1$.
Namely, each transition of $\rho(t)$ is of the form $\rho(t) \xrightarrow{a} a^{m-1}$ or $\rho(t) \xrightarrow{a_j} \rho(t_j)$;
by assumption, for every $j \in J$, the term $t_j$ contains a variable $z \notin var(u_\ell)$,
implying that $depth(\rho(t_j)) \ge m$. Since $\rho(u) \sqsubseteq^2 \rho(t)$ and $\rho(u) \xrightarrow{b_\ell} \rho(u_\ell)$, it follows
that there is a transition $\rho(t) \xrightarrow{b_\ell} q'$ with $\rho(u_\ell) \sqsubseteq^2 q'$. Since $depth(\rho(u_\ell)) < m - 1$
and $depth(q') \ge m - 1$, this contradicts Lemma 2(2). □

Assume a finite equational axiomatization $E$ for BCCS that is sound modulo
$\simeq^2$. The idea behind the proof that $E$ cannot be complete modulo $\simeq^2$ is as
follows. We show that, if $m$ is sufficiently large, then, for all closed derivations
$a(a^{2m} + a^m) \approx p_1 \approx \cdots \approx p_k$ from $E$, $p_k \xrightarrow{a} p_k'$ implies $norm(p_k') = m$. Clearly,
$a(a^{2m} + a^m) + a^{2m+1}$ does not satisfy the latter property, so $a(a^{2m} + a^m) \approx
a(a^{2m} + a^m) + a^{2m+1}$ cannot be derived from $E$. Note that $a(a^{2m} + a^m) \simeq^2
a(a^{2m} + a^m) + a^{2m+1}$.
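The two facts used in this proof idea can be checked mechanically for small $m$. The sketch below is only an illustration under assumptions: closed terms are encoded as frozensets of (action, successor) pairs, the 2-nested preorder is computed via the characterization of Theorem 1, and all function names as well as the range of $m$ are hypothetical choices.

```python
from functools import lru_cache

NIL = frozenset()  # assumed encoding of the constant 0

def prefix(action, term):
    return frozenset({(action, term)})

def power(action, n):
    """The term a^n, with a^0 = 0 and a^(n+1) = a(a^n)."""
    term = NIL
    for _ in range(n):
        term = prefix(action, term)
    return term

def norm(term):
    """Length of the shortest termination trace of a closed term."""
    return 0 if not term else 1 + min(norm(succ) for (_, succ) in term)

@lru_cache(maxsize=None)
def simulated_by(p, q):
    return all(any(a == b and simulated_by(p1, q1) for (b, q1) in q)
               for (a, p1) in p)

@lru_cache(maxsize=None)
def nested2(p, q):
    # Clauses (1) and (2) of Theorem 1.
    step = all(any(a == b and nested2(p1, q1) for (b, q1) in q)
               for (a, p1) in p)
    return step and simulated_by(q, p)

def witness(m):
    """Return the pair a(a^2m + a^m) and a(a^2m + a^m) + a^(2m+1)."""
    lhs = prefix("a", power("a", 2 * m) | power("a", m))
    rhs = lhs | power("a", 2 * m + 1)
    return lhs, rhs

def successor_norms(term):
    """Norms of the terms reachable in one transition."""
    return {norm(succ) for (_, succ) in term}
```

For each $m$ tried, `witness(m)` yields two terms that are related by `nested2` in both directions, while `successor_norms` returns $\{m\}$ for the left-hand side and $\{m, 2m\}$ for the right-hand side.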

Theorem 4. BCCS modulo 2-nested simulation equivalence is not finitely equa- 
tionally axiomatizable. 

Proof. Let $E$ be a finite, non-empty equational axiomatization for BCCS that
is sound modulo $\simeq^2$. Let $m > \max\{depth(t) \mid t \approx u \in E\}$.

First we prove the following fact:

Claim: Let $t \approx u \in E$ and let $\sigma$ be a closed substitution such that $C[\sigma(t)]$ only
has termination traces of lengths $m + 1$ and $2m + 1$. Suppose moreover that for
every transition $C[\sigma(t)] \xrightarrow{a} p'$ we have $norm(p') = m$. Then, for every transition
$C[\sigma(u)] \xrightarrow{a} q'$ we have $norm(q') = m$.
Proof of the claim. First of all, note that, as $C[\sigma(t)] \simeq^2 C[\sigma(u)]$, by Lemma 2(1)
we know that $C[\sigma(u)]$ only has termination traces of lengths $m + 1$ and $2m + 1$.
We now proceed with the proof by distinguishing two cases, depending on the
form of the context $C[]$.

- Case 1: Suppose $C[]$ is of the form $C'[b([] + r)]$.

Consider a transition $C[\sigma(u)] \xrightarrow{a} q'$. Since $C[]$ is of the form $C'[b([] + r)]$,
clearly there is a transition $C[\sigma(t)] \xrightarrow{a} p'$ where $p'$ can be obtained by re-
placing at most one subterm $\sigma(u)$ of $q'$ by $\sigma(t)$. Since $\sigma(t) \simeq^2 \sigma(u)$, by
Lemma 2(3) $\sigma(t)$ and $\sigma(u)$ have the same norm; so $p'$ and $q'$ have the same
norm as well. By assumption $norm(p') = m$, so $norm(q') = m$.

- Case 2: Suppose $C[]$ is of the form $[] + r$.

Let $t$ be of the form $\sum_{i \in I} x_i + \sum_{j \in J} a_j t_j$ and let $u$ be of the form $\sum_{k \in K} y_k +
\sum_{\ell \in L} b_\ell u_\ell$. Consider a transition $\sigma(u) + r \xrightarrow{a} q'$. We distinguish three possible
cases.

- Case 2.1: Let $r \xrightarrow{a} q'$. Then $\sigma(t) + r \xrightarrow{a} q'$, which implies $norm(q') = m$.

- Case 2.2: Let $\sigma(y_k) \xrightarrow{a} q'$ for some $k \in K$. By Lemma 5, $y_k = x_i$ for some
$i \in I$, so $\sigma(x_i) \xrightarrow{a} q'$. Then $\sigma(t) + r \xrightarrow{a} q'$, which implies $norm(q') = m$.

- Case 2.3: Let $q' = \sigma(u_\ell)$ for some $\ell \in L$. By Lemma 5, $var(t_j) \subseteq var(u_\ell)$
for some $j \in J$. Since $depth(t) < m$, we have $depth(t_j) < m$. On the other
hand, $\sigma(t) + r \xrightarrow{a} \sigma(t_j)$ implies $norm(\sigma(t_j)) = m$. Hence, each termination
trace of $\sigma(t_j)$ (so in particular its shortest one) must become, after less
than $m$ transitions, a termination trace of a $\sigma(x)$ with $x \in var(t_j)$. So
$norm(\sigma(t_j)) = m$ implies $norm(\sigma(x)) \le m$ for some $x \in var(t_j)$. Since
$x \in var(u_\ell)$ and $depth(u_\ell) < m$, we have $norm(\sigma(u_\ell)) < 2m$. Since $\sigma(u)$
only has termination traces of lengths $m + 1$ and $2m + 1$, and moreover
$\sigma(u) \xrightarrow{a} \sigma(u_\ell)$, it follows that $\sigma(u_\ell)$ can only have termination traces of
lengths $m$ and $2m$. Hence, $norm(\sigma(u_\ell)) = m$. (End of the proof of the claim)

Suppose now that $p$ only has termination traces of lengths $m + 1$ and $2m + 1$.
Suppose moreover that for every transition $p \xrightarrow{a} p'$ we have $norm(p') = m$. By
induction on the length of equational derivations from $E$, using the claim that
we have just proven, it is easy to show that if $p \approx q$ can be derived from $E$, then
for every transition $q \xrightarrow{a} q'$ we have $norm(q') = m$.

Concluding, $a(a^{2m} + a^m)$ only has termination traces of lengths $m + 1$ and
$2m + 1$. Moreover, its only transition is $a(a^{2m} + a^m) \xrightarrow{a} a^{2m} + a^m$, and $a^{2m} + a^m$
has norm $m$. Finally, $a(a^{2m} + a^m) + a^{2m+1} \xrightarrow{a} a^{2m}$, and $a^{2m}$ does not have norm
$m$. So $a(a^{2m} + a^m) \approx a(a^{2m} + a^m) + a^{2m+1}$ cannot be derived from $E$. Since
$a(a^{2m} + a^m) \simeq^2 a(a^{2m} + a^m) + a^{2m+1}$, we may conclude that $E$ is not complete
modulo $\simeq^2$. □

7 The 3-Nested Simulation Preorder and Beyond 

Groote and Vaandrager [10] actually introduced a hierarchy of n-nested simula-
tion preorders for $n \ge 2$. The following definition generalizes Definition 2.
Definition 6. For $n \ge 1$, $p \sqsubseteq^{n+1} q$ iff $p \mathrel{R} q$ with $R$ a simulation and $R^{-1}$
included in $\sqsubseteq^n$. The kernel of $\sqsubseteq^n$ is denoted by $\simeq^n$.

It is easy to see that $\sqsubseteq^{n+1}$ is included in $\simeq^n$ for $n \ge 1$. The characterization of
the 2-nested simulation preorder in Theorem 1 generalizes to the n-nested simu-
lation preorders for $n \ge 3$. Also, the idea behind the conditional axiomatization
for the 2-nested preorder (see Theorem 2) generalizes to the n-nested simulation
preorders for $n \ge 3$. The proofs of these results are omitted.

Theorem 5. For $n \ge 1$, and for closed process terms $p$ and $q$ over BCCS,
$p \sqsubseteq^{n+1} q$ iff

(1) for all $p \xrightarrow{a} p'$ there is a $q \xrightarrow{a} q'$ with $p' \sqsubseteq^{n+1} q'$, and

(2) $q \sqsubseteq^n p$.

Definition 7. For $n \ge 1$, let $\le_{n+1}$ be the preorder generated by the equational
axioms A1-4 together with $y \le_n x \Rightarrow x \le_{n+1} x + y$.

Theorem 6. For $n \ge 1$, and for closed process terms $p$ and $q$ over BCCS,
$p \le_n q$ iff $p \sqsubseteq^n q$.

It follows from the proof of Theorem 4 that there does not exist a finite inequa- 
tional axiomatization for the 3-nested simulation preorder. 

Theorem 7. BCCS modulo the 3-nested simulation preorder is not finitely in- 
equationally axiomatizable. 

Proof. Let $E$ be a finite inequational axiomatization for BCCS that is sound
modulo $\sqsubseteq^3$. Since $\sqsubseteq^3$ is included in $\simeq^2$, clearly the equational axiomatization
$E' = \{t \approx u \mid t \le u \in E\}$ is sound modulo $\simeq^2$. Let $m > \max\{depth(t) \mid t \approx u \in
E'\}$. In the proof of Theorem 4 it was shown that $a(a^{2m} + a^m) \approx a(a^{2m} + a^m) +
a^{2m+1}$ cannot be derived from $E'$. Hence, $a(a^{2m} + a^m) \le a(a^{2m} + a^m) + a^{2m+1}$
cannot be derived from $E$. Since $a(a^{2m} + a^m) \sqsubseteq^3 a(a^{2m} + a^m) + a^{2m+1}$, it follows
that $E$ is not complete modulo $\sqsubseteq^3$. □

We leave it as an open question whether there exist finite equational axioma-
tizations for n-nested simulation equivalence if $n \ge 3$, and finite inequational
axiomatizations for the n-nested simulation preorder if $n \ge 4$.
Abstract. In this paper we separate many-one reducibility from truth-
table reducibility for distributional problems in Dist$\mathcal{NP}$ under the hy-
pothesis that $\mathcal{P} \ne \mathcal{NP}$. As a first example we consider the 3-Satisfiability
problem (3SAT) with two different distributions on 3CNF formulas. We
show that 3SAT using a version of the standard distribution is truth-table
reducible but not many-one reducible to 3SAT using a less redundant
distribution unless $\mathcal{P} = \mathcal{NP}$.

We extend this separation result and define a distributional complexity
class $\mathcal{C}$ with the following properties:

(1) $\mathcal{C}$ is a subclass of Dist$\mathcal{NP}$; this relation is proper unless $\mathcal{P} = \mathcal{NP}$.

(2) $\mathcal{C}$ contains Dist$\mathcal{P}$, but it is not contained in Ave$\mathcal{P}$ unless Dist$\mathcal{NP} \subseteq$
Ave$\mathcal{ZPP}$.

(3) $\mathcal{C}$ has a $\le^p_m$-complete set.

(4) $\mathcal{C}$ has a $\le^p_{tt}$-complete set that is not $\le^p_m$-complete unless $\mathcal{P} = \mathcal{NP}$.
This shows that under the assumption that $\mathcal{P} \ne \mathcal{NP}$, the two complete-
ness notions differ on some non-trivial subclass of Dist$\mathcal{NP}$.



1 Introduction 

Since the discovery of $\mathcal{NP}$-complete problems by Cook and Levin [Coo71,Lev73],
a considerable number of $\mathcal{NP}$-complete problems have been reported from var-
ious areas in computer science. It is quite interesting and even surprising that
most of these $\mathcal{NP}$-completeness results, with only a few exceptions [VV83], have
been proven by showing a polynomial-time many-one reduction from some other
known $\mathcal{NP}$-complete problem. Recall that there are various reducibility types
(among polynomial-time deterministic reducibilities) and that polynomial-time
many-one reducibility is of the most restrictive type. For example, polynomial-
time truth-table reducibility is, by definition, more general than polynomial-
time many-one reducibility, and in fact, it has been shown [LLS75] that these
two reducibilities differ on some problem. Nevertheless, no $\mathcal{NP}$-complete prob-
lem is known that requires (or even seems to require) polynomial-time truth-table
reducibility for proving its $\mathcal{NP}$-completeness.

* Supported in part by JSPS/NSF cooperative research: Complexity Theory for 
Strategic Goals, 1998-2001. 



A. Ferreira and H. Reichel (Eds.): STACS 2001, LNCS 2010, pp. 51-62, 2001.
Many researchers have studied the difference between these polynomial-time
reducibility types; see, e.g., [LY90,Hom97]. Notice first that showing a
difference between many-one and stronger reducibilities on NP implies that
$P \ne NP$ (because if $P = NP$, then any nontrivial set in NP is NP-complete
under many-one reducibility). Thus, it is more reasonable to assume (at least)
$P \ne NP$ and to ask about the difference between, e.g., many-one and
truth-table reducibilities on NP under this assumption. Unfortunately, however,
the question is still open even assuming that $P \ne NP$. Maybe the difference
is too subtle to be seen in NP by only assuming $P \ne NP$. In this paper we
show that this subtle difference appears when we use reducibility for analyzing
distributional NP problems.

The notion of “distributional problem” was introduced by Levin [Lev86]
in his framework for studying the average-case complexity of NP problems. A
distributional problem is a pair $(A,\mu)$ of a decision problem A (as usual, A
is a set of positive instances of the problem) and an input distribution $\mu$.
Intuitively (see below for the formal definition), by the complexity of
$(A,\mu)$, we mean the complexity of A when inputs are given under the
distribution $\mu$. Analogously to the class NP, Levin proposed to study a class
DistNP, the class of all distributional problems $(A,\mu)$ such that $A \in NP$
and $\mu$ can be computed in polynomial time. He also introduced a class AveP,
the class of distributional problems solvable in polynomial time on average.
Then the question analogous to the P versus NP question is whether
DistNP $\subseteq$ AveP. Levin also extended the notion of reducibility to
distributional problems, and somewhat surprisingly, he proved that the
distributional problem $(\mathrm{BH},\mu_{st})$, where BH is a canonical
NP-complete set and $\mu_{st}$ is a standard uniform distribution, is complete
in DistNP under many-one reducibility. (See, e.g., [Gur91,Wang97] for detailed
explanations and basic results on Levin’s average-case complexity theory.)

Unlike in the worst-case setting, only a small number of “natural”
distributional problems have been shown to be complete for DistNP. Intuitively,
it seems that most NP problems are not hard enough to become complete under
natural distributions. More technically, the condition required for reducibility
(in the average-case framework) is strong: it is affected by even a small change
of distribution. Aida and Tsukiji [ATOO] pointed out that this sensitivity could
be used to show the subtle difference between many-one and more general
reducibilities. They showed two problems $(A,\mu_A)$ and $(B,\mu_B)$ in DistNP
such that $(A,\mu_A) \le_{tt}^p (B,\mu_B)$ but $(A,\mu_A) \not\le_m^p (B,\mu_B)$
unless $P = NP$. Unfortunately, though, these distributions $\mu_A$ and $\mu_B$
are so small that these two problems are trivially in AveP. In fact, e.g., any
problem in EXP is in AveP for some artificial distribution, and hence any
separation result in some larger class still holds within AveP. It has been
left open to show such a difference on nontrivial distributional NP problems.

We solve this open question in this paper. We separate many-one reducibility
from truth-table reducibility for nontrivial problems in DistNP under the
hypothesis that $P \ne NP$. Furthermore, we exhibit a nontrivial subclass of
DistNP in which the many-one and truth-table completeness notions differ unless
$P = NP$.
First we define two versions of the distributional 3-Satisfiability problem
(3SAT) by considering different distributions on 3CNF formulas. The first
distribution $\mu$ is defined by modifying a standard uniform distribution on
3CNF formulas. Here the standard distribution gives each formula the probability
that it is generated by a random process in which every literal is chosen
randomly from the set of variables and their complements. For the second
distribution $\nu$, we consider a less redundant 3CNF representation. Note that
a 3CNF formula F usually has many trivially equivalent formulas; for example,
by permuting the order of clauses in F, we can easily get a different but
equivalent formula. We impose a restriction on the form of formulas to reduce
this redundancy, and define the second distribution $\nu$ so that non-zero
probability is given only to formulas that satisfy our restriction. In this way,
the probability of each formula (of the required form) is increased considerably
(compared with $\mu$). By using this increase, we prove that
$(\mathrm{3SAT},\nu)$ is not many-one reducible to $(\mathrm{3SAT},\mu)$ unless
$P = NP$. On the other hand, by using the self-reducibility of 3SAT, we prove
that $(\mathrm{3SAT},\nu)$ is nevertheless truth-table reducible to
$(\mathrm{3SAT},\mu)$.

Next we extend this separation technique and define a subclass C of DistNP
in which the many-one and truth-table completeness notions differ unless
$P = NP$. Furthermore, we can show that C is not contained in AveP (thus it is
not trivial) unless all DistNP problems are solvable in polynomial time on
average by randomized zero-error computation.

2 Preliminaries 

We use standard notations and definitions from computability theory; see, e.g.,
[BDG88]. We briefly recall the definitions of the average-case complexity classes
used in the following. For definitions and discussion, see [Gur91].

A distributional problem consists of a set L and a distribution on strings
defined by the distribution function $\mu$, i.e., a (real) valued function such
that $\sum_x \mu(x) = 1$. A distribution $\mu$ is called polynomial-time
computable if the binary expansion of the distribution function $\mu^*$, defined
by $\mu^*(x) = \sum_{y \le x} \mu(y)$ for all x, is polynomial-time computable in
the sense that for any x and n, the first n bits of $\mu^*(x)$ are computable
within polynomial time w.r.t. $|x|$ and n.

Let DistNP denote the class of all distributional problems $(L,\mu)$ such that
$L \in NP$ and $\mu$ is polynomial-time computable. Similarly, DistP denotes the
class of distributional problems $(L,\mu) \in$ DistNP such that L is in P.

The average-case analog of P is denoted by AveP and defined as follows. A
distributional problem $(L,\mu)$ is decidable in polynomial time on average if L
is decidable by some t-time bounded Turing machine, and t is polynomial on
$\mu$-average, which means that t, a function from $\Sigma^*$ to $\mathbb{N}$,
satisfies the following for some constant $\varepsilon > 0$ [Lev86,Gur91].

$$\sum_{x} \frac{t^{\varepsilon}(x)}{|x|}\,\mu(x) < \infty.$$
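The role of the exponent $\varepsilon$ can be illustrated numerically. The following is a minimal sketch under assumptions that are not taken from the paper: a toy distribution $\mu(x) = 2^{-n}/(n(n+1))$ for $|x| = n$, and a hypothetical running time that is $n^2$ on every string except the all-zero string, where it is $2^n$. The function name `levin_partial_sum` is made up for this illustration.

```python
def levin_partial_sum(eps: float, max_len: int) -> float:
    """Partial sum, over lengths n <= max_len, of sum_x t(x)^eps / |x| * mu(x).

    Hypothetical toy setting (not from the paper):
      mu(x) = 2^(-n) / (n (n + 1)) for every string x of length n >= 1,
      t(x)  = 2^n on the all-zero string of length n, and n^2 otherwise.
    """
    total = 0.0
    for n in range(1, max_len + 1):
        length_weight = 1.0 / (n * (n + 1))
        # The 2^n - 1 "easy" strings of length n carry the fraction (1 - 2^-n).
        easy = (1.0 - 2.0 ** (-n)) * (n ** (2 * eps)) / n
        # The single hard string: 2^(-n) * (2^n)^eps / n = 2^((eps - 1) n) / n.
        hard = 2.0 ** ((eps - 1.0) * n) / n
        total += length_weight * (easy + hard)
    return total


if __name__ == "__main__":
    for eps in (0.5, 1.0):
        print(eps, levin_partial_sum(eps, 1000))
```

With $\varepsilon = 1/2$ the partial sums level off, so this toy running time is polynomial on average in the sense above even though it is exponential on some inputs; with $\varepsilon = 1$ the partial sums keep growing like a harmonic series.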

Let AveP denote the class of all distributional problems that are decidable in
polynomial time on average. Similarly, let AveZPP denote the class of all
distributional problems that are decidable in polynomial time on average by
randomized Turing machines (without error); see, e.g., [Imp95]. Here we have to
be a little careful in defining average polynomial time for randomized
computation [Gur91]. Let $t(x,r)$ denote the running time of M on input x using
random bits r. We say that M is polynomial-time on average if, for some constant
$\varepsilon > 0$,

$$\sum_{x} \sum_{r} 2^{-|r|}\,\frac{t^{\varepsilon}(x,r)}{|x|}\,\mu(x) < \infty,$$

where r ranges over all binary strings such that M on input x halts using r but
it does not halt using any proper prefix r' of r.

Finally, we define “reducibility” between distributional problems. A
distributional problem $(A,\mu)$ is polynomial-time reducible to $(B,\nu)$ if
there exists an oracle Turing machine M and a polynomial p such that the
following three conditions hold.

(1) The running time of M (with oracle B) is polynomially bounded.

(2) For every x, we have $x \in A \Leftrightarrow x \in L(M,B)$, where $L(M,B)$
is the set of strings accepted by M with oracle B.

(3) For any x, let $Q(M,B,x)$ denote the set of oracle queries made by M with
oracle B and input x. The following condition holds for every y.

$$\nu(y) \ge \sum_{x \,:\, y \in Q(M,B,x)} \frac{\mu(x)}{p(|x|)}.$$

From these three conditions, any problem $(A,\mu)$ that is polynomial-time
reducible to some problem in AveP also belongs to AveP [Lev86,Gur91].
Condition (3) above is called the dominance condition.
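For a many-one reduction on finite toy distributions, the dominance condition can be checked mechanically. The sketch below is only an illustration: the helper name `dominates`, the dictionaries, and the example reduction (dropping the last bit) are hypothetical and not part of the paper.

```python
from collections import defaultdict
from fractions import Fraction


def dominates(mu, nu, reduction, poly):
    """Check nu(y) >= sum over {x : reduction(x) = y} of mu(x) / poly(|x|).

    mu and nu map strings to Fraction probabilities (finite toy supports),
    reduction maps each input string to its single query, and poly plays the
    role of the polynomial p in condition (3).
    """
    pulled_back = defaultdict(Fraction)
    for x, weight in mu.items():
        pulled_back[reduction(x)] += weight / poly(len(x))
    return all(nu.get(y, Fraction(0)) >= mass for y, mass in pulled_back.items())


if __name__ == "__main__":
    # Uniform distribution on 2-bit strings, reduced by dropping the last bit.
    mu = {x: Fraction(1, 4) for x in ("00", "01", "10", "11")}
    drop_last = lambda x: x[:1]
    balanced = {"0": Fraction(1, 2), "1": Fraction(1, 2)}
    skewed = {"0": Fraction(1, 100), "1": Fraction(99, 100)}
    print(dominates(mu, balanced, drop_last, lambda n: 1))    # True
    print(dominates(mu, skewed, drop_last, lambda n: n))      # False
    print(dominates(mu, skewed, drop_last, lambda n: n ** 6)) # True
```

The last two calls show how the polynomial slack matters: a query that is rare under $\nu$ may only receive inputs whose total weight, after division by $p(|x|)$, stays below its own probability.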

By restricting the type of queries, we can define finer reducibilities. A
reduction M is called a truth-table reduction if for every x, the oracle queries
of M on input x are made non-adaptively, i.e., they are independent of the
oracle set. M is a many-one reduction if for every x, M on input x makes exactly
one query, and it accepts x iff the query is in the oracle set. We can define
more general reduction types by considering randomized computation. That is, a
reduction is called a randomized reduction if the oracle Turing machine is
randomized. In this paper, we consider the most restrictive randomized reduction
type, which requires “zero error” of the oracle Turing machine M, i.e., M is
correct and polynomial-time bounded for all inputs and all possible random bits.
The dominance condition needs to be revised for randomized reductions. For any x
and any r, let $Q(M,B,x,r)$ denote the set of oracle queries made by
$M^B(x;r)$, i.e., the execution of M with oracle B on input x using random bits
r. Here we assume that $M^B(x;r)$ halts consuming all bits of r and $M^B(x;r')$
does not halt for any proper prefix r' of r. (If r does not satisfy this
condition, then we simply define $Q(M,B,x,r)$ to be empty.) Then our dominance
condition is stated as follows.

(3’) For every y, we have

$$\nu(y) \ge \sum_{x,r \,:\, y \in Q(M,B,x,r)} \frac{\mu(x)\cdot 2^{-|r|}}{p(|x|)}.$$
3 Separation on 3SAT



Our first separation is on 3SAT, i.e., the set of all satisfiable 3CNF formulas
F. We recall some basic definitions on 3SAT. A formula F is in 3CNF if F is
a conjunction of clauses each of which contains at most 3 literals, i.e., F is
of the form $C_1 \wedge C_2 \wedge \cdots \wedge C_m$, where
$C_i = l_{j_1} \vee l_{j_2} \vee l_{j_3}$ and $l_{j_k}$ is either the variable
$v_{j_k}$ or its negation. (We use the index $j_k$ of each literal to denote
that of its variable.) We use $\mathcal{F}^{(n,m)}$ to denote the set of 3CNF
formulas with n variables and m clauses. (We assume that $m \le 8n^3$.)

The standard distribution $\mu_{st}$ assigns to any formula F in
$\mathcal{F}^{(n,m)}$ the probability

$$\mu_{st}(F) = \frac{1}{n(n+1)} \cdot \frac{1}{8n^3} \cdot 2^{-3m(1+\lceil \log n \rceil)}.$$

That is, we have the following random experiment in mind.

Choose n (number of variables) randomly. Choose $m \in \{1,\ldots,8n^3\}$
(number of clauses) randomly. Choose each of the 3m literals l randomly from the
set of variables and negated variables of size 2n. Let F denote the resulting
formula. Output F.
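The random experiment above can be sketched as a small sampler. This is only an illustration with assumed details: literals are drawn uniformly from the 2n candidates (which matches the factor $2^{-(1+\lceil \log n \rceil)}$ exactly only when n is a power of two), n is drawn with probability $1/(n(n+1))$, and the names `sample_num_variables` and `sample_3cnf` are hypothetical.

```python
import random
from typing import List, Optional, Tuple

Clause = Tuple[int, int, int]


def sample_num_variables(rng: random.Random) -> int:
    """Draw n >= 1 with probability 1 / (n (n + 1))."""
    u = 1.0 - rng.random()  # uniform on (0, 1]
    return int(1.0 / u)     # floor(1/u) = n  iff  1/(n+1) < u <= 1/n


def sample_3cnf(rng: random.Random, n: Optional[int] = None) -> Tuple[int, List[Clause]]:
    """Sample (n, clauses); a literal is +v or -v for a variable index v."""
    if n is None:
        # Heavy-tailed: n (and hence the formula) can occasionally be huge.
        n = sample_num_variables(rng)
    m = rng.randint(1, 8 * n ** 3)  # number of clauses, uniform on {1, ..., 8n^3}
    literals = list(range(1, n + 1)) + [-v for v in range(1, n + 1)]
    clauses = [tuple(rng.choice(literals) for _ in range(3)) for _ in range(m)]
    return n, clauses


if __name__ == "__main__":
    n, formula = sample_3cnf(random.Random(0), n=3)
    print(n, len(formula), formula[:3])
```

Passing n explicitly keeps the example small; letting the sampler choose n reproduces the first step of the experiment but may produce very large formulas.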

In order to simplify our discussion, we restrict the form of formulas so that
$m = f_0(n)$, where $f_0(n) = \lceil n \log n \rceil$. Since m is determined
from n, the standard distribution is modified as follows.

$$\mu_{st,f_0}(F) = \begin{cases} 8n^3 \cdot \mu_{st}(F), & \text{if } F \in \mathcal{F}^{(n,f_0(n))}, \text{ and} \\ 0, & \text{otherwise.} \end{cases}$$



We should note here that the same result holds for any “smooth” function f such
that $n \le f(n) \le n \log n$ for all n. Here a function f is smooth if there
is no big jump from $f(n-1)$ to $f(n)$; more precisely, there exist constants
$c_f > 1$ and $d_f > 0$ such that for any sufficiently large n and for some
$k \le d_f \log n$, we have $f(n) - c_f \log n \le f(n-k) \le f(n) - \log n$. For
example, consider $f(n) = n \lceil \log n \rceil$. While this function satisfies
our smoothness condition for most n, we have $f(n) > f(n-k) + \log n$ for any
$k = O(\log n)$ if n is sufficiently large and
$\lceil \log n \rceil = 1 + \lceil \log(n-1) \rceil$. On the other hand, a
function like $f(n) = \lceil n \log n \rceil$ satisfies this smoothness
condition for $k = 1$ and $c_f = 2$.

Note that it is still open whether $(\mathrm{3SAT},\mu_{st,f_0})$ is in AveP,
i.e., polynomial-time solvable on average. (Though under $\mu_{st,f_0}$ most
formulas are in fact unsatisfiable, and standard algorithms perform well on
average [KiSe94].) On the other hand, it has been shown that
$(\mathrm{3SAT},\mu_{st,f})$ defined using $f(n) > \ldots$ for some $d > 0$ is
indeed in AveP [KP92].

Now we define the first distribution. For a technical reason, we consider
3CNF formulas with some additional clauses. For any $n > 0$, let
$d(n) = f_0(n) - f_0(n-1)$ (where $f_0(0) = 0$). A 3CNF formula
$P = C_1 \wedge \cdots \wedge C_{d(n)}$ is called a type-I prefix for n if each
$C_i$ is of the form $C_i = (v_{j_i} \vee v_{j_i} \vee v_{j_i})$ for some
$j_i \in \{3i-2, 3i-1, 3i\}$. Note that there are $3^{d(n)}$ type-I prefixes for
n. We consider only formulas G that are of the form $P \wedge F$ for some type-I
prefix P for n and $F \in \mathcal{F}^{(n,f_0(n-1))}$. We use
$\mathcal{F}_{I}^{(n)}$ to denote the set of such formulas. This somewhat
artificial requirement is just to simplify our analysis of the truth-table
reduction defined in Lemma 3.
Our first distribution is defined as follows.

$$\mu(G) = \begin{cases} \dfrac{1}{n(n+1)} \cdot 3^{-d(n)} \cdot 2^{-3 f_0(n-1)(1+\lceil \log n \rceil)}, & \text{if } G \in \mathcal{F}_{I}^{(n)}, \\ 0, & \text{otherwise.} \end{cases}$$

Next we define the second distribution. As mentioned in the Introduction, the
3CNF representation has redundancy; i.e., a 3CNF formula (usually) has many
trivially equivalent formulas. Here we introduce one restriction on the form of
formulas for reducing some of this redundancy; the restriction is not essential
for the hardness of the satisfiability problem.

For any n, a 3CNF formula $P = C_1 \wedge \cdots \wedge C_{d(n)}$ is called the
type-II prefix for n if each $C_i$ is of the form
$C_i = (v_{3i-2} \vee v_{3i-1} \vee v_{3i})$. Note that for each n, the type-II
prefix for n is uniquely determined. We consider only formulas F in
$\mathcal{F}^{(n,f_0(n))}$ such that the first $d(n)$ clauses of F are the
type-II prefix for n. Let $\mathcal{F}_{II}^{(n)}$ denote the set of such
formulas. Note that $\mathcal{F}_{I}^{(n)}$ and $\mathcal{F}_{II}^{(n)}$ are
subsets of $\mathcal{F}^{(n,f_0(n))}$. As shown in the next lemma, the
restriction to formulas in $\mathcal{F}_{II}^{(n)}$ is not essential for the
hardness of the satisfiability problem.

Lemma 1. For any 3CNF formula $F \in \mathcal{F}^{(n,f_0(n))}$, we can either
convert it to an equivalent formula $F' \in \mathcal{F}_{II}^{(n)}$ (by (i)
reordering clauses and (ii) renaming and/or changing the signs of variables) or
determine the satisfiability of F in polynomial time.



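Now our second distribution is defined as follows.

$$\nu(F) = \begin{cases} \dfrac{1}{n(n+1)} \cdot 2^{-3(f_0(n)-d(n))(1+\lceil \log n \rceil)}, & \text{if } F \in \mathcal{F}_{II}^{(n)}, \\ 0, & \text{otherwise.} \end{cases}$$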

Intuitively, $\nu$ corresponds to the following random generation.

Choose n (number of variables) randomly. Fix the first $d(n)$ clauses as
required for the type-II prefix. Then choose the remaining $f_0(n) - d(n)$
clauses as in the standard distribution. Output F.

We observe that the distributions $\mu$ and $\nu$ defined above are
polynomial-time computable. Thus, both distributional problems
$(\mathrm{3SAT},\mu)$ and $(\mathrm{3SAT},\nu)$ belong to DistNP.
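The gap between the two distributions can be computed exactly for small n. The sketch below assumes the reading $f_0(n) = \lceil n \log_2 n \rceil$ with $f_0(0) = 0$ and the probability formulas as stated above; the function names are hypothetical, and `mu(n)` and `nu(n)` return the probability of a single formula with n variables in the respective support.

```python
import math
from fractions import Fraction


def ceil_log2(n: int) -> int:
    """ceil(log2 n) for n >= 1, computed exactly on integers."""
    return (n - 1).bit_length()


def f0(n: int) -> int:
    # Assumed reading of f_0: ceil(n * log2 n), with f0(0) = 0.
    return math.ceil(n * math.log2(n)) if n >= 1 else 0


def d(n: int) -> int:
    """Length of the prefix: d(n) = f0(n) - f0(n - 1)."""
    return f0(n) - f0(n - 1)


def mu(n: int) -> Fraction:
    """Probability of one formula made of a type-I prefix plus f0(n-1) clauses."""
    bits = 3 * f0(n - 1) * (1 + ceil_log2(n))
    return Fraction(1, n * (n + 1)) * Fraction(1, 3 ** d(n)) * Fraction(1, 2 ** bits)


def nu(n: int) -> Fraction:
    """Probability of one formula that starts with the type-II prefix."""
    bits = 3 * (f0(n) - d(n)) * (1 + ceil_log2(n))
    return Fraction(1, n * (n + 1)) * Fraction(1, 2 ** bits)


if __name__ == "__main__":
    for n in (2, 8, 16):
        print(n, d(n), nu(n) / mu(n))
```

Under these assumptions the ratio $\nu(F)/\mu(G)$ is exactly $3^{d(n)}$, which is polynomial in n because $d(n) = O(\log n)$; this is the slack that the truth-table reduction of Lemma 3 relies on.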

For our separation result, we first show that $(\mathrm{3SAT},\nu)$ is not
$\le_m^p$-reducible to $(\mathrm{3SAT},\mu)$ unless $P = NP$.



Lemma 2. If $(\mathrm{3SAT},\nu) \le_m^p (\mathrm{3SAT},\mu)$, then we have
3SAT $\in P$ and hence $P = NP$.

Proof. Assume there exists a many-one reduction R from $(\mathrm{3SAT},\nu)$ to
$(\mathrm{3SAT},\mu)$. Consider the 3SAT solver defined in Figure 1.

The correctness is clear by the definition of many-one reducibility. The
polynomial-time bound of this algorithm is guaranteed as follows. The reduction
R reduces (in each iteration) a formula F in $\mathcal{F}^{(n,f_0(n))}$ to a
formula F' in $\mathcal{F}^{(n'-1,f_0(n'-1))}$ with $n' \le n$. That is, the
number of variables is reduced by at least one in each while-iteration.
Algorithm 3SAT Solver
input F in $\mathcal{F}^{(n,f_0(n))}$;
F' $\leftarrow$ F; n' $\leftarrow$ n;
while n' > log n do
    modify F' to an equivalent formula F'' in $\mathcal{F}_{II}^{(n')}$;
    % The procedure mentioned in Lemma 1 is used.
    % If this fails, then the satisfiability of F' can be determined directly.
    G $\leftarrow$ R(F''); n' $\leftarrow$ the number of variables in G;
    % G is in $\mathcal{F}_{I}^{(n')}$, i.e.,
    % G = P $\wedge$ F' with some type-I prefix P for n' and F' $\in \mathcal{F}^{(n',f_0(n'-1))}$.
    remove each clause $(v_{k_i} \vee v_{k_i} \vee v_{k_i})$ of P by assigning $v_{k_i} = 1$ in F';
    % F' may be reduced to a simpler formula.
    (if necessary) add redundant variables or clauses so that F' $\in \mathcal{F}^{(n'-1,f_0(n'-1))}$;
end-while
output 1 if the final F' is satisfiable, and output 0 otherwise;
end-algorithm.

Fig. 1. SAT Solver



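This claim is proved by using the dominance condition. From the dominance
condition, for some constant $c > 0$ and for any sufficiently large n, we have

$$\frac{1}{n(n+1)}\,2^{-3(f_0(n)-d(n))(1+\lceil\log n\rceil)} = \nu(F) \le n^c \cdot \mu(F') = \frac{n^c}{n'(n'+1)}\,3^{-d(n')}\,2^{-3f_0(n'-1)(1+\lceil\log n'\rceil)}.$$

Here F' denotes the formula $R(F)$, which has n' variables. Since
$d(n) \ge \lceil \log n \rceil$, this implies

$$\frac{1}{n(n+1)}\,2^{-3(f_0(n)-\lceil\log n\rceil)(1+\lceil\log n\rceil)} \le \frac{n^c}{n'(n'+1)}\,2^{-3f_0(n'-1)(1+\lceil\log n'\rceil)}.$$

Now suppose that $n' > n$. Then from the above, it should hold that
$c \log n \ge 3 \log^2 n$, which is impossible for sufficiently large n.
Therefore, we have $n' \le n$.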

On the other hand, a $\le_{tt}^p$-reduction exists from $(\mathrm{3SAT},\nu)$
to $(\mathrm{3SAT},\mu)$.

Lemma 3. $(\mathrm{3SAT},\nu) \le_{tt}^p (\mathrm{3SAT},\mu)$.

Proof. We define a truth-table reduction from $(\mathrm{3SAT},\nu)$ to
$(\mathrm{3SAT},\mu)$. For our discussion, consider any formula F in
$\mathcal{F}_{II}^{(n)}$. Recall that
$F = C_1 \wedge \cdots \wedge C_{d(n)} \wedge E$, where each $C_i$,
$1 \le i \le d(n)$, is of the form $(v_{3i-2} \vee v_{3i-1} \vee v_{3i})$. We
would like to solve the satisfiability of F by asking polynomially many
non-adaptive queries to 3SAT. Note that all queried formulas have to be of some
appropriate form; more precisely, they should belong to
$\mathcal{F}_{I}^{(n')}$ for some n'. Furthermore, since $\nu(F)$ (for
$F \in \mathcal{F}_{II}^{(n)}$) is much bigger than $\mu(G)$ (for
$G \in \mathcal{F}_{I}^{(n)}$), we cannot increase the size of queried formulas.
Our idea is simple. We delete the first $d(n)$ clauses $C_1, \ldots, C_{d(n)}$
by considering all possible partial assignments satisfying all these clauses.
Since each $C_i$ is $(v_{3i-2} \vee v_{3i-1} \vee v_{3i})$, we only have to
assign 1 to one of the three variables $v_{3i-2}, v_{3i-1}, v_{3i}$ for
satisfying $C_i$. That is, for every partial assignment that assigns 1 to one of
the three variables $v_{3i-2}, v_{3i-1}, v_{3i}$ for each i, $1 \le i \le d(n)$,
we can substitute the first $d(n)$ clauses by a type-I prefix for n. The
resulting formula G is in $\mathcal{F}_{I}^{(n)}$ (i.e., it has (at most) n
variables and consists of a type-I prefix for n followed by
$f_0(n) - d(n) = f_0(n-1)$ clauses). Note that there are $3^{d(n)}$ such partial
assignments, which is polynomial in n, and that F is satisfiable if and only if
one of the obtained formulas G is satisfiable. Therefore, the above procedure
is indeed a disjunctive truth-table reduction that asks a polynomial number of
formulas (of the same size).
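The query construction in this proof can be checked by brute force on tiny instances. The following is a minimal sketch with hypothetical names; clauses are tuples of signed variable indices, the prefix length is passed explicitly as `d` instead of being derived from n, and the satisfiability test is exponential and meant only for very small n.

```python
from itertools import product


def satisfiable(clauses, n):
    """Brute-force satisfiability test; only sensible for very small n."""
    for bits in product([False, True], repeat=n):
        value = lambda lit: bits[abs(lit) - 1] == (lit > 0)
        if all(any(value(lit) for lit in clause) for clause in clauses):
            return True
    return False


def type_two_prefix(d):
    """Clauses (v_{3i-2} or v_{3i-1} or v_{3i}) for i = 1, ..., d."""
    return [(3 * i - 2, 3 * i - 1, 3 * i) for i in range(1, d + 1)]


def truth_table_queries(rest, d):
    """Yield P + rest for each of the 3^d type-I prefixes P."""
    for choice in product(*type_two_prefix(d)):
        prefix = [(j, j, j) for j in choice]
        yield prefix + list(rest)


def solve_via_queries(rest, d, n):
    """Disjunctive truth-table evaluation: accept iff some query is satisfiable."""
    return any(satisfiable(q, n) for q in truth_table_queries(rest, d))


if __name__ == "__main__":
    rest = [(-1, -1, -1), (-2, -2, -2), (-4, 5, 6)]
    full = type_two_prefix(2) + rest
    print(satisfiable(full, 6), solve_via_queries(rest, 2, 6))
```

All queries are generated before any answer is inspected, which is what makes the reduction non-adaptive, and the original formula is accepted exactly when at least one query is satisfiable.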

The dominance condition is satisfied since (i) $\nu(F) \le n^c \cdot \mu(G)$ and
(ii) any query formula G is asked for only one formula F. Condition (ii) is
satisfied since the type-II prefix of F is unique, and G is identical to F on
all other clauses.

The fact that $\nu(F) \le n^c \cdot \mu(G)$ for some $c > 0$ is immediate by
comparing $\nu(F)$ and $\mu(G)$ as follows.

$$\nu(F) = \frac{2^{-3(f_0(n)-d(n))(1+\lceil\log n\rceil)}}{n(n+1)} = 3^{d(n)} \cdot \frac{3^{-d(n)}\,2^{-3f_0(n-1)(1+\lceil\log n\rceil)}}{n(n+1)} = 3^{d(n)} \cdot \mu(G),$$

and $3^{d(n)} \le n^c$ for a suitable constant c because $d(n) = O(\log n)$.



From the above two lemmas, we have the following separation result.



Theorem 1. There exist polynomial-time computable distributions $\nu$ and $\mu$
such that $(\mathrm{3SAT},\nu) \le_{tt}^p (\mathrm{3SAT},\mu)$, but
$(\mathrm{3SAT},\nu) \not\le_m^p (\mathrm{3SAT},\mu)$ unless $P = NP$.



4 Separating Completeness Notions 

In this section we define a subclass of DistNP in which we can show the
difference between the many-one and truth-table completeness notions. More
specifically, we will define a distributional complexity class C with the
following properties:

(1) C is a subclass of DistNP, and furthermore, the relation is proper unless
$P = NP$.

(2) C contains DistP, but C is not contained in AveP unless
DistNP $\subseteq$ AveZPP.

(3) C has a $\le_m^p$-complete set.

(4) There exists a problem in C that is $\le_{tt}^p$-complete in C but that is
not $\le_m^p$-complete in C unless $P = NP$.

That is, if $P \ne NP$, then the two completeness notions differ on some
subclass of DistNP. Recall that it is not known whether the assumption that
$(\mathrm{3SAT},\mu_{st,f_0}) \in$ AveP has some unlikely consequence such as
DistNP $\subseteq$ AveZPP above. Hence we cannot simply define C as the set of
distributional problems that are many-one reducible to
$(\mathrm{3SAT},\mu_{st,f_0})$.

First we define the complexity class C. For this purpose, we consider the
following version of the bounded halting problem, which we call the Bounded
Halting problem with Padding. Here for a technical reason, we consider only
Turing machines M using one tape as both an input and a work tape. We also
assume that M’s tape alphabet is {0,1,B} and that M cannot go beyond the cells
containing 0 or 1. Note that this is not an essential restriction if we assume
that M’s reachable tape cells are initially filled with 0. On the other hand,
with this assumption, we can represent the content of the whole tape of M by a
string in $\{0,1\}^*$ of fixed length.

Below we use $\phi$ to denote any fixed function on $\mathbb{N}$ such that
$n \le \phi(n) \le p(n)$ for some polynomial p and $\phi(n)$ is computable
within polynomial time in n.

$\mathrm{BHP}_\phi = \{ \langle M,q,i,w,y \rangle$ :

(i) M is an NDTM, q is a state, i, $1 \le i \le |w|$, is a head position, and
$w, y \in \{0,1\}^*$, where w is M’s tape and y is padding,

(ii) $|y| = \phi(|M| + |w| + t)$ for some $t \in \mathbb{N}$, and

(iii) M has an accepting path of length t from configuration $(q,i,w)$. $\}$

Notice here that w represents the content of the whole of M’s tape. We assume
that M’s tape head does not go outside of w. We assume some reasonable encoding
of M and its state q, and $|M|$ and $|q|$ are the lengths of the descriptions of
M and q under this encoding. Again for simplifying our discussion below, we
assume that for each M and w, the lengths $|q|$ and $|i|$ are fixed.

In the literature, the following versions of the halting problem BH and its
padded version BH' have been studied [Gur91]. Our $\mathrm{BHP}_\phi$ can be
regarded as a variation of BH' when $\phi$ is defined as $\phi(n) = n$.

$\mathrm{BH} = \{ \langle M,x,0^t \rangle$ : M accepts x in t steps $\}$, and
$\mathrm{BH}' = \{ \langle M,x,y \rangle$ : M accepts x in $|y|$ steps $\}$.



As a distribution we consider the standard distribution extended to tuples,
e.g., every instance $\langle M,q,i,w,y \rangle$ of $\mathrm{BHP}_\phi$ has the
following probability.

$$\mu_{st}(\langle M,q,i,w,y \rangle) = \frac{1}{\alpha(|M|,|q|,|i|,|w|,|y|)} \cdot 2^{-(|M|+|q|+|i|+|w|+|y|)},$$

where $\alpha(n_1,n_2,\ldots,n_k) = \prod_{i=1}^{k} n_i(n_i+1)$. Note however
that a unary padding string has probability inverse polynomial in its length;
for example, for any instance $\langle M,x,0^t \rangle$ of BH, we have
$\mu_{st}(\langle M,x,0^t \rangle) = \frac{1}{\alpha(|M|,|x|,t)} \cdot 2^{-(|M|+|x|)}$.

First it should be mentioned that $(\mathrm{BH},\mu_{st})$ is reducible to
$(\mathrm{BHP}_\phi,\mu_{st})$ via a randomized reduction of the strongest type,
i.e., the one with no error.



Proposition 1. For any polynomially bounded $\phi(n)$ that is polynomial-time
computable w.r.t. n, there is a polynomial-time randomized reduction (with no
error) from $(\mathrm{BH},\mu_{st})$ to $(\mathrm{BHP}_\phi,\mu_{st})$.

Since $(\mathrm{BH},\mu_{st})$ is a complete problem in DistNP [Gur91], this
proposition shows that $(\mathrm{BHP}_\phi,\mu_{st})$ is complete in DistNP
under the zero-error randomized reducibility. On the other hand, since
$(\mathrm{BHP}_\phi,\mu_{st})$ is a distributional problem with a flat
distribution, as we will see below, $(\mathrm{BH},\mu_{st})$ is not
$\le_m^p$-reducible to $(\mathrm{BHP}_\phi,\mu_{st})$ unless $P = NP$.

We may use any reasonable function for $\phi$. For the following discussion,
we fix $\phi(n) = n \log n$, by which we formally mean that
$\phi(n) = \lceil n \log n \rceil$ (see the smoothness discussion in the
previous section). Let BHP denote the problem $\mathrm{BHP}_\phi$ with this
$\phi$. Now our class C is defined as the class of distributional problems
$(L,\mu)$ such that (i) $\mu$ is polynomial-time computable, and (ii) $(L,\mu)$
is $\le_m^p$-reducible to $(\mathrm{BHP},\mu_{st})$.

Note first that if $(L,\mu)$ is $\le_m^p$-reducible to
$(\mathrm{BHP},\mu_{st})$, then L must be in NP. Thus, C is contained in DistNP.
But $(\mathrm{BH},\mu_{st})$ is not $\le_m^p$-reducible to
$(\mathrm{BHP},\mu_{st})$ unless $P = NP$. Thus, if $P \ne NP$, then C is a
proper subclass of DistNP because $(\mathrm{BH},\mu_{st})$ does not belong to C.
On the other hand, since $(\mathrm{BHP},\mu_{st})$ is complete in DistNP under
the zero-error randomized reducibility, it cannot be in AveP unless
DistNP $\subseteq$ AveZPP; that is, C $\not\subseteq$ AveP unless
DistNP $\subseteq$ AveZPP.

Proposition 2. The class C defined above has the following properties.

(1) It is a subclass of DistNP, and the relation is proper unless $P = NP$.

(2) It contains DistP, but is not contained in AveP unless
DistNP $\subseteq$ AveZPP.

Clearly, the class C has $\le_m^p$-complete sets; e.g., $(\mathrm{BHP},\mu_{st})$
is one of them. On the other hand, we can define a $\le_{tt}^p$-complete problem
in C that is not $\le_m^p$-complete unless $P = NP$.

Theorem 2. Define BHP' as follows with $\phi'(n) = n \log n + \log^2 n$ (or,
more formally, $\phi'(n) = \lceil n \log n + \log^2 n \rceil$). Then we have
$(\mathrm{BHP},\mu_{st}) \le_{tt}^p (\mathrm{BHP}',\mu_{st})$, but
$(\mathrm{BHP},\mu_{st}) \not\le_m^p (\mathrm{BHP}',\mu_{st})$ unless $P = NP$.
That is, $(\mathrm{BHP}',\mu_{st})$ is $\le_{tt}^p$-complete in C but it is not
$\le_m^p$-complete unless $P = NP$.

$\mathrm{BHP}' = \{ \langle M,q,i,w,u,v \rangle$ :

(i) M is an NDTM, q is a state, i is a head position, $w,u,v \in \{0,1\}^*$,

(ii) $|v| = \phi'(|M| + |w| + t - |u|)$ for some t,

(iii) $|u| = \log(|M| + |w| + t)$, and

(iv) starting from configuration $(q,i,w)$,
M has an accepting path of length $t - |u|$ whose prefix is u. $\}$

Proof. First we show that $(\mathrm{BHP}',\mu_{st})$ is $\le_m^p$-reducible to
$(\mathrm{BHP},\mu_{st})$. This implies that $(\mathrm{BHP}',\mu_{st})$ is
indeed contained in the class C. Let $\langle M,q,i,w,u,v \rangle$ be any
instance of BHP' satisfying the syntactic conditions, i.e., conditions (i) to
(iii), of BHP' for some number t. Let $m = |M| + |w| + t - |u|$. We map this
instance to $\langle M,q',i',w',y' \rangle$, where q', i', w' are respectively
M’s state, head position, and tape content after executing $|u|$ steps on the
path u starting from configuration $(q,i,w)$. In order to satisfy the syntactic
conditions of BHP (and keep the consistency as a reduction), y' should be a
string of length $\phi(m)$. But since $\phi(m) = \phi'(m) - \log^2 m$ (recall
that $|w| = |w'|$), we have $|y'| \le |v| - \log^2(m)$; hence, we can simply use
the prefix of v of appropriate length for y'. Notice that this mapping may not
be one-to-one. But first note that
$$\mu_{st}(\langle M,q',i',w',y' \rangle) = \sum_{v \in V(y')} \cdots,$$

where $V(y')$ is the set of v of length $\phi'(m)$ whose prefix is y'. Also, for
considering all configurations from which $(q',i',w')$ is reachable, let
$C(q',i',w')$ be the set of pairs of M’s configurations $(q,i,w)$ and u of
length $\log(|M| + |w| + t)$ such that the configuration $(q',i',w')$ is reached
after executing $|u| = \log(|M| + |w| + t)$ steps from $(q,i,w)$ following u.
Since $|u| = \log(|M| + |w| + t) = \log(|M| + |w'| + t)$, $C(q',i',w')$ has at
most $|M|(|M| + |w| + t)^c \times (|M| + |w| + t)$ elements for some constant c.
Thus, for some constant c', we have

$$\frac{1}{|M|(|M| + |w| + t)^{c'}} \sum_{((q,i,w),u) \in C(q',i',w'),\ v \in V(y')} \mu_{st}(\langle M,q,i,w,u,v \rangle) \le \mu_{st}(\langle M,q',i',w',y' \rangle).$$

Therefore the dominance condition is satisfied.



We observe here that the many-one reduction decreases the length of the
instance by order $\log^2$. Let $\ell = |M| + |q| + |i| + |w| + |u| + |v|$ and
$\ell' = |M| + |q'| + |i'| + |w'| + |y'|$. Then if $\ell$ is sufficiently large,
we have

$$\ell' \le \ell - \log^2(m) \le \ell - \log^2(\ell^{1/2}) = \ell - \tfrac{1}{4}\log^2 \ell,$$

since we may assume that
$m^2 \ge m \log m + \log^2 m + (|M| + |w| + |q| + |i| + |u|)
= |M| + |q| + |i| + |w| + |u| + |v| = \ell$, for sufficiently large $\ell$.

Next suppose that there is a $\le_m^p$-reduction from $(\mathrm{BHP},\mu_{st})$
to $(\mathrm{BHP}',\mu_{st})$. We will show that this assumption implies
$P = NP$. Consider any $\langle M,q,i,w,y \rangle$ satisfying the syntax of BHP,
and let $\langle M',q',i',w',u',v' \rangle$ be the instance of BHP' obtained by
the assumed reduction. We may assume that $\langle M',q',i',w',u',v' \rangle$
satisfies the syntax of BHP' for some t'; i.e.,
$|v'| = \phi'(|M'| + |w'| + t' - |u'|)$. Let
$\ell = |M| + |q| + |i| + |w| + |y|$. By using the reduction from BHP' to BHP
explained above, we further reduce the instance
$\langle M',q',i',w',u',v' \rangle$ to some instance
$\langle M',q'',i'',w'',y'' \rangle$ of BHP. Note that
$|y''| = \phi(|M'| + |w''| + t'')$ where $t'' = t' - |u'|$.

We estimate $\ell' = |M'| + |q'| + |i'| + |w'| + |u'| + |v'|$ and
$\ell'' = |M'| + |q''| + |i''| + |w''| + |y''|$, and prove that
$\ell'' < \ell$, i.e., $\langle M',q'',i'',w'',y'' \rangle$ is shorter than
$\langle M,q,i,w,y \rangle$. First, from the above analysis, we have
$\ell'' \le \ell' - \tfrac{1}{4}\log^2 \ell'$.

Now consider the case that $\ell' < \ell/2$. Then from the above bound, we
immediately have $\ell'' < \ell$ for sufficiently large $\ell$. Thus, consider
the other case, i.e., $\ell' \ge \ell/2$. Even in this case, $\ell'$ cannot be
so large. This is because from the dominance condition, we have
$\ell' \le \ell + d \log \ell$ for some constant $d > 0$, and hence,

$$\ell'' \le (\ell + d\log\ell) - \tfrac{1}{4}\log^2(\ell + d\log\ell) \le (\ell + d\log\ell) - \tfrac{1}{4}\log^2 \ell,$$

which, by using the assumption that $\ell' \ge \ell/2$, implies $\ell'' < \ell$
if $\ell$ is large enough.

Therefore, the obtained instance $\langle M',q'',i'',w'',y'' \rangle$ is at
least one bit shorter than the original instance $\langle M,q,i,w,y \rangle$.
Thus, by applying this process sufficiently many times, which is still
polynomially bounded, we can obtain a trivial instance for BHP. Thus BHP is in
P, which implies that $P = NP$.

Finally, we show a $\le_{tt}^p$-reduction from $(\mathrm{BHP},\mu_{st})$ to
$(\mathrm{BHP}',\mu_{st})$. For a given instance $\langle M,i,q,w,y \rangle$ of
BHP with $|y| = \phi(|M| + |w| + t)$ for some t, we only have to ask queries of
the form $\langle M,i,q,w,u,v \rangle$ for all $u \in \{0,1\}^{\log m}$, where
$m = |M| + |w| + t$, and v is the prefix of y of length
$\phi'(|M| + |w| + t - \log m)$. (We will see below that
$\phi'(|M| + |w| + t - \log m)$ is smaller than $\phi(|M| + |w| + t)$; hence,
this choice of v is possible.)

Clearly, this reduction works as a disjunctive truth-table reduction from BHP
to BHP'. To check the dominance condition, consider any
$\langle M,i,q,w,u,v \rangle$ satisfying the syntax of BHP'; we estimate the
probability of instances in BHP that ask $\langle M,i,q,w,u,v \rangle$ in our
$\le_{tt}^p$-reduction. First note that

$$|v| = \phi'(|M| + |w| + t - \log m) = (m - \log m)\log(m - \log m) + (\log(m - \log m))^2 < m \log m = \phi(|M| + |w| + t) = |y|.$$
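The length inequality used here can be checked numerically. The sketch below assumes base-2 logarithms, the ceilings from the formal definitions of $\phi$ and $\phi'$, and reads $|u| = \log m$ as $\lceil \log_2 m \rceil$; these rounding choices and the function names are assumptions made for the illustration.

```python
import math


def phi(n: int) -> int:
    """Padding length for BHP: ceil(n log2 n)."""
    return math.ceil(n * math.log2(n))


def phi_prime(n: int) -> int:
    """Padding length for BHP': ceil(n log2 n + (log2 n)^2)."""
    return math.ceil(n * math.log2(n) + math.log2(n) ** 2)


def query_padding_fits(m: int) -> bool:
    """True if a prefix of y of length phi'(m - |u|) exists, i.e. is shorter than |y| = phi(m)."""
    k = math.ceil(math.log2(m))  # assumed integer reading of |u| = log m
    return phi_prime(m - k) < phi(m)


if __name__ == "__main__":
    print(all(query_padding_fits(m) for m in range(8, 5000)))
    print(phi_prime(16) - phi(16))  # extra padding of about log^2 n
```

In this range the check succeeds for every m, which is consistent with the claim that v can always be taken as a prefix of y.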
Let I be the set of instances in BHP that ask $\langle M,i,q,w,u,v \rangle$.
Then I consists of strings $\langle M,i,q,w,vy' \rangle$ for some y'. Thus, the
total probability of instances in BHP that ask $\langle M,i,q,w,u,v \rangle$ is
estimated as follows.

$$\sum_{\langle M,i,q,w,vy' \rangle \in I} \mu_{st}(\langle M,i,q,w,vy' \rangle)
= 2^{|y'|} \times (1/\alpha) \cdot 2^{-(|M|+|q|+|i|+|w|+|v|+|y'|)}
= (1/\alpha) \cdot \alpha' 2^{|u|} \cdot (1/\alpha') \cdot 2^{-(|M|+|q|+|i|+|w|+|u|+|v|)}.$$

Here $\alpha = \alpha(|M|,|q|,|i|,|w|,|v|,|y'|)$ and
$\alpha' = \alpha(|M|,|q|,|i|,|w|,|u|,|v|)$. Note that
$1/\alpha \le |u|^2/\alpha'$. Since $(\log m)^2 2^{\log m}$ is bounded by
$p(|\langle M,w,u,v \rangle|)$ with some polynomial p, the dominance condition
is satisfied.



References 



[ATOO] S. Aida and T. Tsukiji, On the difference among polynomial-time
reducibilities for distributional problems (Japanese), in Proc. of the LA
Symposium, Winter, RIMS publication, 2000.

[BDG88] J. Balcázar, J. Díaz, and J. Gabarró, Structural Complexity I, EATCS
Monographs on Theoretical Computer Science, Springer-Verlag, 1988.

[Betal92] S. Ben-David, B. Chor, O. Goldreich, and M. Luby, On the theory of
average case complexity, Journal of Comput. and Syst. Sci., 44:193-219, 1992.

[Coo71] S.A. Cook, The complexity of theorem proving procedures, in Proc. of
the third ACM Sympos. on Theory of Comput., ACM, 151-158, 1971.

[Gur91] Y. Gurevich, Average case completeness, Journal of Comput. and Syst.
Sci., 42:346-398, 1991.

[Hom97] S. Homer, Structural properties of complete problems for exponential
time, in Complexity Theory Retrospective 2 (A.L. Selman Ed.), Springer-Verlag,
135-154, 1997.

[Imp95] R. Impagliazzo, A personal view of average-case complexity, in Proc.
10th Conference Structure in Complexity Theory, IEEE, 134-147, 1995.

[KiSe94] S. Kirkpatrick and B. Selman, Critical behavior in the satisfiability
of random Boolean expressions, Science, 264, 1297-1301, 1994.

[KP92] E. Koutsoupias and C. Papadimitriou, On the greedy algorithm for
satisfiability, Inform. Process. Lett., 43, 53-55, 1992.

[LLS75] R. Ladner, N. Lynch, and A. Selman, A comparison of polynomial time
reducibilities, Theoretical Computer Science, 1:103-123, 1975.

[Lev73] L.A. Levin, Universal sequential search problem, Problems of
Information Transmission, 9:265-266, 1973.

[Lev86] L.A. Levin, Average case completeness classes, SIAM J. Comput.,
15:285-286, 1986.

[LY90] L. Longpré and P. Young, Cook reducibility is faster than Karp
reducibility, J. Comput. Syst. Sci., 41, 389-401, 1990.

[VV83] U. Vazirani and V. Vazirani, A natural encoding scheme proved
probabilistic polynomial complete, Theoret. Comput. Sci., 24, 291-300, 1983.

[Wang97] J. Wang, Average-case computational complexity theory, in Complexity
Theory Retrospective 2 (A.L. Selman Ed.), Springer-Verlag, 295-328, 1997.




Matching Polygonal Curves with Respect to the 

Fréchet Distance 



Helmut Alt, Christian Knauer, and Carola Wenk* 

Institut für Informatik, Freie Universität Berlin 
Takustraße 9, D-14195 Berlin, Germany 
{alt,knauer,wenk}@inf.fu-berlin.de 



Abstract. We provide the first algorithm for matching two polygonal 
curves P and Q under translations with respect to the Fréchet distance. 
If P and Q consist of m and n segments, respectively, the algorithm has 
runtime $O((mn)^3 (m+n)^2 \log(m+n))$. We also present an algorithm 
giving an approximate solution as an alternative. To this end, we generalize 
the notion of a reference point and observe that all reference points for 
the Hausdorff distance are also reference points for the Fréchet distance. 
Furthermore we give a new reference point that is substantially better 
than all known reference points for the Hausdorff distance. These results 
yield a $(1+\varepsilon)$-approximation algorithm for the matching problem that 
has runtime $O(\varepsilon^{-2} mn)$. 

Keywords: Computational geometry, Shape matching, Fréchet distance, 
Parametric search, Approximation algorithm, Reference point, Steiner 
point. 



1 Introduction 

The task of comparing two two-dimensional shapes arises naturally in many 
applications, e.g., in computer graphics, computer vision and computer aided 
design. Often two-dimensional shapes are given by the planar curves forming 
their boundaries which directly leads to the problem of comparing two planar 
curves. There are several possible distance measures to assess the ‘resemblance’ 
of the shapes, and there are also different kinds of transformations that are 
allowed to match them, see [5] for a survey. We will focus here on the Fréchet 
distance $\delta_F$ for polygonal curves, and we will search for a translation which, when 
applied to the first curve, minimizes the Fréchet distance to the second one. In 
[4] it is shown how to compute the Fréchet distance for two polygonal curves. 

The only algorithm we know of that decides whether there is a transformation 
that, when applied to the first curve, results in a Fréchet distance less than or 
equal to some given parameter $\varepsilon$ (this is called the decision problem, see Problem 2 
below) is presented in [10], where the admissible transformations are translations 
in a fixed direction. But to our knowledge there is no algorithm which actually 
computes the Fréchet distance under a non-trivial class of transformations^1. 

No. AL 253/4-3. 

In the following we will adopt some basic definitions and results from [4] on 
which we will subsequently build up. 

Definition 1 (Polygonal Curve) A continuous mapping $f : [a,b] \to \mathbb{R}^2$ with 
$a, b \in \mathbb{R}$ and $a < b$ is called a curve. A polygonal curve is a curve 
$P : [0,n] \to \mathbb{R}^2$ with $n \in \mathbb{N}$, such that for all $i \in \{0, 1, \dots, n-1\}$ each 
$P_i := P|_{[i,i+1]}$ is affine, i.e., 

$$P(i+\lambda) = (1-\lambda)\,P(i) + \lambda\,P(i+1) \quad \text{for all } \lambda \in [0,1].$$ 



Definition 2 (Fréchet Distance) Let $f : [a,a'] \to \mathbb{R}^2$ and $g : [b,b'] \to \mathbb{R}^2$ be 
curves. Then $\delta_F(f,g)$ denotes their Fréchet distance, defined as 

$$\delta_F(f,g) := \inf_{\substack{\alpha : [0,1] \to [a,a'] \\ \beta : [0,1] \to [b,b']}} \; \max_{t \in [0,1]} \| f(\alpha(t)) - g(\beta(t)) \|,$$ 

where $\|\cdot\|$ denotes the $L_2$ norm, and $\alpha$, $\beta$ range over continuous and increasing 
functions with $\alpha(0) = a$, $\alpha(1) = a'$, $\beta(0) = b$ and $\beta(1) = b'$ only. 

As a popular illustration of the Fréchet metric, suppose a man is walking 
his dog: he walks on one curve, the dog on the other. Both are allowed 
to control their speed but are not allowed to go backwards. Then the Fréchet 
distance of the curves is the minimal length of a leash that is necessary. 
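
The leash picture can be tried out numerically with the discrete Fréchet distance, a simpler relative of the continuous distance of Definition 2 in which man and dog may only stand on vertices. The sketch below is an illustration under that assumption, not the algorithm of this paper; the function name `discrete_frechet` is hypothetical. The value it returns is an upper bound on the continuous Fréchet distance of the polygonal curves through the same vertices.

```python
from math import dist

def discrete_frechet(p, q):
    """Discrete Frechet distance between two vertex sequences p and q.

    table[i][j] holds the shortest leash that lets both walkers reach the
    vertex pair (p[i], q[j]) while only ever stepping forward.
    """
    n, m = len(p), len(q)
    table = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                table[i][j] = d
            elif i == 0:
                table[i][j] = max(table[0][j - 1], d)
            elif j == 0:
                table[i][j] = max(table[i - 1][0], d)
            else:
                # Either walker, or both, advanced by one vertex.
                best_previous = min(table[i - 1][j], table[i][j - 1], table[i - 1][j - 1])
                table[i][j] = max(best_previous, d)
    return table[-1][-1]
```

For the curves with vertices (0,0),(2,0) and (0,0),(1,1),(2,0) the discrete value is the square root of 2, whereas the continuous Fréchet distance is 1, which shows that the discrete variant only bounds the continuous one from above.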

In the rest of the paper we will develop algorithms for the following two 
problems: 

Problem 1 ($\delta_F$ - Optimization Problem) 

Given two polygonal curves P, Q, and a class of transformations $\mathcal{T}$. 

Find a $\tau \in \mathcal{T}$ such that $\delta_F(\tau(P),Q)$ is as small as possible. 

Similar to [4] we will first consider the decision problem which we will afterwards 
optimize applying Megiddo's parametric search technique, cf. [8]. The decision 
problem in our setting is the following: 

Problem 2 ($\delta_F$ - Decision Problem) 

Given two polygonal curves P, Q, a class of transformations $\mathcal{T}$, and $\varepsilon > 0$. 
Decide whether there exists a $\tau \in \mathcal{T}$ such that $\delta_F(\tau(P), Q) \le \varepsilon$. 

We will show that in the case of translations we can solve the decision problem 
in $O((mn)^3 (m+n)^2)$ time. The parametric search adds only a logarithmic 
overhead, since we can apply Cole's trick for parametric search based on sorting, so 
we can solve the optimization problem in $O((mn)^3 (m+n)^2 \log(m+n))$ time. 

^1 We recently learned that Efrat et al. [7] have independently developed an algorithm 
for the decision problem under translations. However, the runtime they achieve is 
slower than ours by a quadratic factor, and their result is rather complicated and 
relies on complex data structures. 

2 Computing the Fréchet Distance 

Throughout the rest of the paper let $P : [0,m] \to \mathbb{R}^2$ and $Q : [0,n] \to \mathbb{R}^2$ be 
polygonal curves. Unless stated otherwise $\varepsilon > 0$ is a fixed real parameter. In the 
sequel we will use the notion of a free space which was introduced in [4]: 

Definition 3 (Free Space, [4]) The set $F_\varepsilon(P,Q) := \{(s,t) \in [0,m] \times [0,n] \mid 
\|P(s) - Q(t)\| \le \varepsilon\}$, or $F_\varepsilon$ for short, denotes the free space of P and Q. 

Sometimes we refer to $[0,m] \times [0,n]$ as the free space diagram; the feasible points 
$p \in F_\varepsilon$ will be called 'white' and the infeasible points $p \in [0,m] \times [0,n] - F_\varepsilon$ 
will be called 'black' (for obvious reasons, cf. Figure 1). Consider $[0,m] \times [0,n]$ 
as composed of the mn cells $C_{i,j} := [i-1, i] \times [j-1, j]$, $1 \le i \le m$, $1 \le j \le n$. 
Then $F_\varepsilon(P,Q)$ is composed of the mn free spaces for each pair of edges, 
$F_\varepsilon(P_{i-1}, Q_{j-1}) = F_\varepsilon(P,Q) \cap C_{i,j}$. 

The following results from [4] describe the structure of the free space and 
link it to the problem of computing $\delta_F$. 

Lemma 4 (Alt/Godau, [4]) The free space of two line segments is the inter- 
section of the unit square with an affine image of the unit disk, i.e., with an 
ellipse, possibly degenerated to the space between two parallel lines. 
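
A short derivation, added here as an illustration, may make Lemma 4 more concrete. The symbols p, q, u, v, A and $D_\varepsilon$ are introduced only for this sketch and are not notation taken from [4].

```latex
% Two edges, each parametrized over the unit interval (notation assumed for this sketch):
%   P_i(s) = p + s u,   Q_j(t) = q + t v,   s, t \in [0,1].
\[
  F_\varepsilon(P_i, Q_j)
  = \bigl\{ (s,t) \in [0,1]^2 \;:\; \| A\,(s,t)^{\mathsf T} + (p - q) \| \le \varepsilon \bigr\},
  \qquad A = ( u \mid -v ) \in \mathbb{R}^{2 \times 2}.
\]
% If u and v are linearly independent, A is invertible and
\[
  \bigl\{ x \in \mathbb{R}^2 : \| A x + (p - q) \| \le \varepsilon \bigr\}
  = A^{-1}\bigl( D_\varepsilon - (p - q) \bigr),
\]
% where D_\varepsilon is the disk of radius \varepsilon about the origin: an affine image
% of a disk, i.e. an ellipse. If u and v are parallel, A has rank 1 and the set is
% (if non-empty) the strip between two parallel lines.
```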

Lemma 5 (Alt/Godau, [4]) For polygonal curves P and Q we have $\delta_F(P,Q) 
\le \varepsilon$, exactly if there exists a curve within $F_\varepsilon(P,Q)$ from (0,0) to (m,n) which is 
monotone in both coordinates. 

For proofs of the above two Lemmas see [4]. Figure 1 shows polygonal curves 
P, Q, a distance $\varepsilon$, and the corresponding diagram of cells $C_{i,j}$ with the free space 
$F_\varepsilon$. Observe that the curve as a continuous mapping from [0,1] to $[0,m] \times [0,n]$ 
directly gives feasible reparametrizations, i.e., two reparametrizations $\alpha$ and $\beta$, 
such that $\max_{t \in [0,1]} \|P(\alpha(t)) - Q(\beta(t))\| \le \varepsilon$. 




Fig. 1. Two polygonal curves P and Q and their free space diagram for a given e. An 
example monotone curve in the free space (c.f. Lemma 5) is drawn bold. 

For $(i,j) \in \{1, \dots, m\} \times \{1, \dots, n\}$ let $L^F_{i,j} := \{i-1\} \times [a_{i,j}, b_{i,j}]$ (or $B^F_{i,j} := 
[c_{i,j}, d_{i,j}] \times \{j-1\}$) be the left (or bottom) line segment bounding $C_{i,j} \cap F_\varepsilon$ (see 
Figure 2). 




Fig. 2. Intervals of the free space on the boundary of a cell. 



By induction it can easily be seen that those parts of the segments $L^F_{i,j}$ and 
$B^F_{i,j}$ which are reachable from (0,0) by a monotone path in $F_\varepsilon$ are also line 
segments. Using a dynamic programming approach one can compute them, and 
thus decide if $\delta_F(P,Q) \le \varepsilon$. For details we refer the reader to the proof of the 
following theorem in [4]: 

Theorem 6 (Alt/Godau, [4]) For given polygonal curves P, Q and $\varepsilon > 0$ one 
can decide in $O(mn)$ time, whether $\delta_F(P,Q) \le \varepsilon$. 
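
The dynamic programming idea behind Theorem 6 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation from [4]: curves are lists of at least two (x, y) vertices, indices are 0-based (so they are shifted by one against the cells $C_{i,j}$ above), and the names `free_interval`, `propagate` and `frechet_at_most` are hypothetical.

```python
from math import dist, sqrt

def free_interval(p, a, b, eps):
    """Parameters t in [0, 1] with |p - (a + t (b - a))| <= eps, as (lo, hi) or None."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    fx, fy = a[0] - p[0], a[1] - p[1]
    qa = dx * dx + dy * dy
    qb = 2 * (fx * dx + fy * dy)
    qc = fx * fx + fy * fy - eps * eps
    if qa == 0:                       # degenerate edge: a single point
        return (0.0, 1.0) if qc <= 0 else None
    disc = qb * qb - 4 * qa * qc
    if disc < 0:
        return None
    r = sqrt(disc)
    lo, hi = max(0.0, (-qb - r) / (2 * qa)), min(1.0, (-qb + r) / (2 * qa))
    return (lo, hi) if lo <= hi else None

def propagate(free, crossing, parallel):
    """Reachable part of a cell's exit edge whose free interval is `free`.

    `crossing` is the reachable interval on the perpendicular entry edge,
    `parallel` the one on the opposite (parallel) entry edge.
    """
    if free is None:
        return None
    if crossing is not None:          # free space in a cell is convex
        return free
    if parallel is None:
        return None
    lo = max(free[0], parallel[0])    # monotone paths cannot move backwards
    return (lo, free[1]) if lo <= free[1] else None

def frechet_at_most(p, q, eps):
    """Decide whether the Frechet distance of polygonal curves p, q is <= eps."""
    m, n = len(p) - 1, len(q) - 1     # numbers of edges
    if dist(p[0], q[0]) > eps or dist(p[m], q[n]) > eps:
        return False
    # left[i][j]: reachable part of {i} x [j, j+1]; bottom[i][j]: of [i, i+1] x {j}
    left = [[None] * n for _ in range(m + 1)]
    bottom = [[None] * (n + 1) for _ in range(m)]
    ok = True
    for j in range(n):                # border of the diagram at s = 0
        ok = ok and dist(p[0], q[j]) <= eps
        left[0][j] = free_interval(p[0], q[j], q[j + 1], eps) if ok else None
    ok = True
    for i in range(m):                # border of the diagram at t = 0
        ok = ok and dist(q[0], p[i]) <= eps
        bottom[i][0] = free_interval(q[0], p[i], p[i + 1], eps) if ok else None
    for i in range(m):
        for j in range(n):
            right = free_interval(p[i + 1], q[j], q[j + 1], eps)
            top = free_interval(q[j + 1], p[i], p[i + 1], eps)
            left[i + 1][j] = propagate(right, bottom[i][j], left[i][j])
            bottom[i][j + 1] = propagate(top, left[i][j], bottom[i][j])
    return left[m][n - 1] is not None or bottom[m - 1][n] is not None
```

As a usage example, a straight curve from (0,0) to (3,0) compared with a curve that runs (0,0), (2,0), (1,0), (3,0) along the same line has Fréchet distance 0.5, so the decision procedure should answer yes for 0.6 and no for 0.4.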

Now let us observe a continuity property of $F_\varepsilon$: As we have already mentioned, 
each (possibly clipped) ellipse in $F_\varepsilon$ is the affine image of a unit disk. Thus each 
ellipse in $F_\varepsilon$ varies continuously in $\varepsilon$. This implies the following observation: 

Observation 7 (See [4]) If $\varepsilon = \delta_F(P,Q)$, then $F_\varepsilon$ contains at least one 
monotone path from (0,0) to (m,n) and for each such path $\pi$ one of the following 
cases occurs: 

a) $L^F_{i,j}$ or $B^F_{i,j}$ is a single point on $\pi$ for some pair (i,j). (The path passes 
through a passage between two neighboring cells that consists of a single 
point.) 

b) $a_{i,j} = b_{k,j}$ (or $c_{i,j} = d_{i,k}$) for some i, j, k and $\pi$ passes through $(i, a_{i,j})$ and 
$(k, b_{k,j})$ (or $\pi$ passes through $(c_{i,j}, j)$ and $(d_{i,k}, k)$). (The path contains a 
'clamped' horizontal or vertical passage, see Figure 3.) 

Figure 4 shows the geometric situations that correspond to these two cases. 
In case a) the reparametrization maps the point P(i-1) to the only point on 
the edge $Q_j$ that has distance $\varepsilon$ from P(i-1). In case b) it maps the part 
of P between P(i-1) and P(k-1) to the only point on the edge $Q_j$ that 
has distance $\varepsilon$ from P(i-1) and P(k-1). This situation covers the case of 
horizontally clamped paths. The geometric situations that involve a vertically 
clamped passage are similar, with the roles of P and Q interchanged. Note that 
we can actually view case a) as a special case of case b) with i = k. 

Fig. 3. The path contains a 'clamped' horizontal passage in the j-th row between the 
spikes A and B. 




(a) 



(b) 



Fig. 4. The geometric situations corresponding to a horizontally clamped path. 



3 Minimizing the Frechet Distance 



First we give a rough sketch of the basic idea of our algorithm: Assume that 
there is at least one translation that moves P to a Fréchet distance at most $\varepsilon$ 
to Q. Then we can move P to a position $\tau_=$ where the Fréchet distance to Q 
is exactly $\varepsilon$. According to Observation 7 the free space diagram $F_\varepsilon(\tau_=(P),Q)$ 
then contains at least one clamped path. As a consequence, one of the geometric 
situations from Figure 4 must occur. Therefore the set of translations that attain 
a Fréchet distance of exactly $\varepsilon$ is a subset of the set of translations that realize 
at least one of those geometric situations. The set of translations that create a 
geometric situation involving the two different vertices P(i-1) and P(k-1) 
from P and the edge $Q_j$ from Q consists of two segments in transformation space, 
i.e., it can be described geometrically. 

Now assume that the geometric situation from above is specified by the two 
vertices P(i-1) and P(k-1) and the edge $Q_j$. When we move P in such a way 
that P(i-1) and P(k-1) remain at distance $\varepsilon$ from a common point on an 
edge of Q (i.e., we shift P 'along' Q), we will preserve one geometric situation 
(namely the one involving P(i-1) and P(k-1) and some edge of Q). At some 
point however, we will reach a placement where the Fréchet distance becomes 
larger than $\varepsilon$. This means that immediately before that point it was exactly $\varepsilon$, 
so there is a second placement $\tau'_=$ such that $F_\varepsilon(\tau'_=(P),Q)$ contains at least one 
clamped path and consequently another geometric situation must occur. So the 
set of translations that attain a Fréchet distance of exactly $\varepsilon$ is a subset of the 
set of translations that realize at least two such geometric situations. 

After this informal description of the basic ideas let us go into more detail 
now: 

Convention: In this section $\mathcal{T}_2$ denotes the group of planar translations, P and 
Q are polygonal curves with m and n vertices, respectively, and $\varepsilon > 0$ is a real 
parameter. 

A translation $\tau = ((x,y) \mapsto (x + \delta_x, y + \delta_y)) \in \mathcal{T}_2$ can be specified by the pair 
$(\delta_x, \delta_y) \in \mathbb{R}^2$ of parameters. The set of parameters of all translations in $\mathcal{T}_2$ is 
called the parameter space of $\mathcal{T}_2$, or translation space for short, and we identify 
$\mathcal{T}_2$ with its parameter space. 

Let us now take a look at the free space $F_\varepsilon(\tau(P),Q)$ when $\tau$ varies over $\mathcal{T}_2$. 

Now we show that each of the $O(mn)$ ellipses (and thus also each clipped ellipse) 
varies continuously in $\tau \in \mathcal{T}_2$. In fact we consider all mn ellipses, even those 
that have an empty intersection with their corresponding square in the diagram 
(let us call these 'invisible'). Note that an ellipse is generated by two linearly 
independent line segments, one from P and one from Q. Parallel line segments 
generate only a 'degenerate ellipse', namely the space between two parallel lines. 
So if we fix a translation $\tau \in \mathcal{T}_2$ it is easy to see that each (possibly invisible) 
ellipse in $F_\varepsilon(\tau(P),Q)$ is a translation of the corresponding ellipse in $F_\varepsilon(P,Q)$. 
In fact, the translation is $(-\lambda, \mu)$ where $\lambda$ and $\mu$ are the coefficients which are 
obtained by representing $\tau$ as a linear combination of the direction unit vectors 
of the line segments. Thus each ellipse varies continuously in $\tau \in \mathcal{T}_2$. Note that 
this is still true if an ellipse is visible but its translate is invisible or vice versa. 
A similar argument holds for degenerate ellipses. 

Definition 8 (Configuration) A triple (p, p', s) that consists of two (not 
necessarily different) vertices p and p' of P and an edge s of Q is called an 
h-configuration. v-configurations are defined analogously with the roles of P and 
Q exchanged. A configuration is an h- or v-configuration. 



Definition 9 (Critical Translations) Let c = (x, y, s) be an h-configuration 
and c' = (x', y', s') be a v-configuration. The sets 

$T_{crit}(c) := \{\tau \in \mathcal{T}_2 \mid \exists z \in s : \|\tau(x) - z\| = \|\tau(y) - z\| = \varepsilon\}$ and 
$T_{crit}(c') := \{\tau \in \mathcal{T}_2 \mid \exists z' \in s' : \|x' - \tau(z')\| = \|y' - \tau(z')\| = \varepsilon\}$ 

are called the sets of critical translations for c and c'. A translation is called 
critical if it is critical for some configuration. 



Lemma 10 If $\delta_F(\tau(P),Q) = \varepsilon$, then $\tau$ is critical. 

Proof. By Observation 7 there is a path $\pi$ in $F_\varepsilon(\tau(P),Q)$ for which case a) or 
b) occurs. If the corresponding geometric situation (cf. Figure 4) involves the 
vertices $\tau(P(i-1))$ and $\tau(P(k-1))$ on $\tau(P)$ and a point on the edge $Q_j$ then 
the translation $\tau$ is critical for the h-configuration $(P(i-1), P(k-1), Q_j)$. If the 
geometric situation involves vertices from Q and a segment of $\tau(P)$, the same 
argument yields a v-configuration. □ 

Note that the condition in Lemma 10 is only necessary but not sufficient, i.e., 
there are indeed critical translations $\tau$ with $\delta_F(\tau(P),Q) \ne \varepsilon$. This is because a 
critical translation for a configuration (x, y, s) does not even have to map the 
part of the curve between x and y within distance $\varepsilon$ to the corresponding point 
on s. 

Let us now take a closer look at the critical translations in $\mathcal{T}_2$: For a given 
configuration (x, y, s) with two different vertices (which corresponds to case (b) 
in Figure 4) the set of critical translations is described by two parallel line 
segments in translation space, where each line segment is a translate of s. If the 
two vertices in the configuration are the same (which is case (a) in Figure 4) 
the set of critical translations is described by a ’racetrack’ in translation space, 
which is the locus of points having distance e to a translate of s. Note that a 
’racetrack’ consists of line segments and circular arcs. 
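
For an h-configuration with two different vertices, the two parallel segments can be computed directly: a translation is critical exactly when it moves one of the two intersection points of the circles of radius epsilon around x and y onto a point of s. The following is a minimal sketch of that computation under the assumption that points are (x, y) tuples and s is a pair of endpoints; the function name `critical_segments` is hypothetical.

```python
from math import dist, sqrt

def critical_segments(x, y, s, eps):
    """Critical translations of the h-configuration (x, y, s) with x != y.

    Returns a list of segments in translation space, each given by its two
    endpoints. The list is empty if no point has distance eps from both x and y.
    """
    d = dist(x, y)
    if d == 0 or d > 2 * eps:
        return []
    mx, my = (x[0] + y[0]) / 2, (x[1] + y[1]) / 2
    h = sqrt(max(eps * eps - (d / 2) ** 2, 0.0))
    # Unit normal of the line through x and y.
    nx, ny = -(y[1] - x[1]) / d, (y[0] - x[0]) / d
    # Points at distance eps from both x and y (they coincide if d == 2 eps).
    centres = {(mx + h * nx, my + h * ny), (mx - h * nx, my - h * ny)}
    a, b = s
    # tau is critical iff tau = z - w for some z on s and some centre w,
    # so each centre contributes one translate of s.
    return [((a[0] - wx, a[1] - wy), (b[0] - wx, b[1] - wy)) for wx, wy in sorted(centres)]
```

With x = (-3, 0), y = (3, 0), s from (0, 0) to (2, 0) and eps = 5, the result consists of two horizontal segments at heights 4 and -4, each a translate of s.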

We call the arrangement in translation space consisting of the curves 
describing all critical translations of all configurations the arrangement of critical 
translations. There are $O(mn(m+n))$ different configurations, so the 
combinatorial complexity of the arrangement of critical translations (i.e., the number of 
vertices, line segments, and circular arcs) is $O((mn(m+n))^2)$. 

Lemma 11 If there is a translation $\tau_< \in \mathcal{T}_2$ such that $\delta_F(\tau_<(P),Q) \le \varepsilon$ then 
there is a translation $\tau_= \in \mathcal{T}_2$ that is critical such that $\delta_F(\tau_=(P),Q) = \varepsilon$. 

Proof. Pick any translation $\tau_> \in \mathcal{T}_2$ such that $\delta_F(\tau_>(P),Q) > \varepsilon$. By continuity, 
there exists a translation $\tau_=$ on any curve between $\tau_<$ and $\tau_>$ in translation space 
such that $\delta_F(\tau_=(P),Q) = \varepsilon$. By Lemma 10 the translation $\tau_=$ is critical. □ 

This result states that, whenever there is some translation $\tau_<$ that moves P 
into Fréchet distance at most $\varepsilon$ to Q, there is also a 'canonical' translation $\tau_=$ 
that results in a Fréchet distance exactly $\varepsilon$ and that lies on the arrangement of 
critical translations. So in order to check if there is a translation that moves P 
into Fréchet distance at most $\varepsilon$ to Q, it is sufficient to check all translations on 
the arrangement of critical translations. 

However, since the translation space has more than one degree of freedom, the 
arrangement of critical translations contains an infinite number of translations. 
So our observation does not help from an algorithmic point of view. Lemma 14 
shows that we can restrict our attention to the zero-dimensional parts of the 
arrangement, i.e., intersection points and endpoints of the curves describing the 
critical translations. First we need the following two observations: 

Observation 12 Let $c = (P(i-1), P(k-1), Q_j)$, with $i \ne k$, be an 
h-configuration. Then $a_{i,j} = b_{k,j}$ in $F_\varepsilon(\tau(P),Q)$ for all $\tau \in T_{crit}(c)$, i.e., the relative 
position of the two spikes stays the same for all $\tau \in T_{crit}(c)$ (cf. Figures 3 and 

Observation 13 Let $c = (P(i-1), P(k-1), Q_j)$, with $i \ne k$, be an 
h-configuration. Now we consider a feasible reparametrization for some $\tau_= \in T_{crit}(c)$, that 
maps the part of $\tau_=(P)$ between $\tau_=(P(i-1))$ and $\tau_=(P(k-1))$ to a point on 
$Q_j$. This corresponds to a path in $F_\varepsilon(\tau_=(P),Q)$ that is clamped between the 
two corresponding vertical spikes in cell (i,j) and (k,j) of $F_\varepsilon(\tau_=(P),Q)$. Now 
from Observation 12 it follows that for each $\tau \in T_{crit}(c)$ the relative position of 
the spikes does not change, i.e., we cannot 'destroy' this path locally by moving along 
$T_{crit}(c)$. 

Of course both observations remain true if we consider h-configurations that 
correspond to case a) of Observation 7 (where i = k) or v-configurations (where 
the roles of P and Q are interchanged). 

Lemma 14 If there is a translation $\tau_< \in \mathcal{T}_2$ such that $\delta_F(\tau_<(P),Q) \le \varepsilon$ then 
there is a translation $\tau_= \in \mathcal{T}_2$ that lies on a vertex of the arrangement of critical 
translations such that $\delta_F(\tau_=(P),Q) = \varepsilon$. 

Proof. Suppose all vertices of the arrangement of critical translations yield a 
Fréchet distance greater than $\varepsilon$. By Lemma 11 there is a critical translation 
$\tau_= \in \mathcal{T}_2$ with $\delta_F(\tau_=(P),Q) = \varepsilon$, and by Definitions 8 and 9 there is a 
configuration c such that $\tau_= \in T_{crit}(c)$. Now pick any translation $\tau_> \in \mathcal{T}_2$ such that 
$\delta_F(\tau_>(P),Q) > \varepsilon$. We can assume without loss of generality that $\tau_=$ lies in 
an 'extreme' position on $T_{crit}(c)$, which means that $\delta_F(\tau(P),Q) > \varepsilon$ for every 
$\tau \in T_{crit}(c)$ that lies 'between' $\tau_=$ and $\tau_>$. Considering the free space diagram 
this means that $F_\varepsilon(\tau_=(P),Q)$ contains a monotone path, but $F_\varepsilon(\tau(P),Q)$ does 
not contain this or any other monotone path anymore. By continuity this can 
only happen if each monotone path in $F_\varepsilon(\tau_=(P),Q)$ is 'clamped' between two 
'spikes' which close the narrow passage in the free space when moving from $\tau_=$ 
to $\tau_>$ on $T_{crit}(c)$. 

But according to Observation 12 this cannot be true for the spikes 
corresponding to the critical translations $T_{crit}(c)$. Thus there must be another 
configuration c' such that $\tau_= \in T_{crit}(c')$, and close to $\tau_=$ the curve describing $T_{crit}(c')$ 
differs from $T_{crit}(c)$. Since both curves are algebraic, $T_{crit}(c) \cap T_{crit}(c')$ is 
zero-dimensional, and thus a vertex of the arrangement. □ 

So in order to solve the decision problem for a given $\varepsilon$ it is sufficient to check 
for all translations $\tau$ that correspond to vertices of the arrangement of critical 
translations whether $\delta_F(\tau(P),Q) \le \varepsilon$. We thus have altogether $O((mn)^2 (m+n)^2)$ 
translations for each of which we check in $O(mn)$ time if it brings P into 
distance at most $\varepsilon$ to Q, which solves Problem 2 for the case of translations and 
yields the following theorem: 

Theorem 15 For given polygonal curves P, Q and $\varepsilon > 0$ one can decide in 
$O((mn)^3 (m+n)^2)$ time whether there is a translation $\tau \in \mathcal{T}_2$ such that 
$\delta_F(\tau(P),Q) \le \varepsilon$. 
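
The bound of Theorem 15 can be retraced by a short count. The following summary is added as an illustration; it assumes, as the description of the critical curves above suggests, that any two of the curves (line segments and circular arcs) intersect in a constant number of points.

```latex
\begin{align*}
\#\{\text{h-configurations}\} &= O(m^2 n), \qquad
\#\{\text{v-configurations}\} = O(m n^2),\\
\#\{\text{configurations}\} &= O(m^2 n + m n^2) = O\bigl(mn(m+n)\bigr),\\
\#\{\text{arrangement vertices}\} &= O\bigl((mn(m+n))^2\bigr) = O\bigl((mn)^2 (m+n)^2\bigr),\\
\text{total time} &= O\bigl((mn)^2 (m+n)^2\bigr) \cdot O(mn) = O\bigl((mn)^3 (m+n)^2\bigr).
\end{align*}
```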

In order to find a translation that minimizes the Fréchet distance between 
the two polygonal curves we apply the parametric search paradigm. For this 
we generalize the approach of [4]. Remember that for a given configuration c = 
(x, y, s) the set of critical translations $T_{crit}(c)$ is described by two parallel line 
segments or by a 'racetrack' in translation space. Now when we let $\varepsilon$ vary, $T_{crit}(c)$ 
changes accordingly, namely the distance between the parallel line segments or 
the radius of the 'racetrack' varies depending on $\varepsilon$. Note that for small $\varepsilon$, $T_{crit}(c)$ 
might even be empty, which happens for example when $\|x - y\| > 2\varepsilon$. 

For a given $\varepsilon$ let $S(\varepsilon)$ be the set of $O((mn(m+n))^2)$ vertices of the 
arrangement of critical translations. In fact, one can track each vertex in $S(\varepsilon)$ for varying 
$\varepsilon$, i.e., one can interpret each vertex in $S(\varepsilon)$ as a function of $\varepsilon$. Let S be the set of 
these vertex-functions. Note that the vertex-functions in S might not be defined 
for small $\varepsilon$. For each of the $O((mn(m+n))^2)$ translation functions $\tau(\varepsilon)$ in S we 
compute the free space $F_\varepsilon(\tau(\varepsilon)(P), Q)$ depending on $\varepsilon$. In fact we only compute 
all $a_{i,j}(\tau,\varepsilon)$, $b_{i,j}(\tau,\varepsilon)$, $c_{i,j}(\tau,\varepsilon)$, and $d_{i,j}(\tau,\varepsilon)$ which depend on $\varepsilon$ and $\tau$, and of 
which there are $O((mn)^3 (m+n)^2)$. For the parametric search an $\varepsilon$ is critical 
if two of these functions have the same value (for the same translation function 
$\tau$). A parametric search over all $O((mn)^3 (m+n)^2)$ values of $a_{i,j}(\tau,\varepsilon)$, $b_{i,j}(\tau,\varepsilon)$, 
$c_{i,j}(\tau,\varepsilon)$, and $d_{i,j}(\tau,\varepsilon)$ thus yields an optimum $\varepsilon$ together with an optimum 
translation. As in [4] we apply a parallel sorting algorithm which generates a superset 
of the critical values of $\varepsilon$ we need. By utilizing Cole's trick [6] for parametric 
search based on sorting, which in general yields a runtime of $O((k + T_{seq}) \log k)$ 
where $T_{seq}$ is the sequential runtime for the decision problem and k is the number 
of values to be sorted, we obtain a runtime of $O((mn)^3 (m+n)^2 \log(m+n))$. This 
solves Problem 1 for the case of translations and proves the following theorem: 

Theorem 16 For given polygonal curves P, Q one can compute a translation 
$\tau_{min}$ in $O((mn)^3 (m+n)^2 \log(m+n))$ time, such that $\delta_F(\tau_{min}(P),Q) = 
\min_{\tau \in \mathcal{T}_2} \delta_F(\tau(P),Q)$. 



3.1 Other Transformation Classes 

We are currently investigating the application of the techniques from above 
to other classes of transformations, such as translations in a fixed direction, 
rotations around a fixed center, rigid motions, and arbitrary affine maps, for 
matching curves in two and higher dimensions. 

In the parameter space of the transformation class under consideration the 
set of critical transformations for a configuration is a semi-algebraic set in gen- 
eral, which is defined by a constant number of polynomials of bounded degree. 
Therefore we can define the arrangement of critical transformations in the same 
way as before. 

A suitable generalization of Lemma 14 should imply that only the zero- 
dimensional pieces of this arrangement have to be considered as candidates for 
a successful match. This immediately yields an algorithm with a runtime that 
depends on the complexity of the arrangement of critical transformations, which 
in turn depends on the dimension of the parameter space as well as on the 
dimension of the underlying Euclidean space. 

4 Approximately Minimizing the Fréchet Distance 

The algorithms we described so far cannot be considered to be efficient. To rem- 
edy this situation, we present approximation algorithms which do not necessar- 
ily compute the optimal transformation, but one that yields a Frechet distance 
which differs from the optimum value by a constant factor only. To this end, we 
generalize the notion of a reference point, c.f. [2] and [1], to the Frechet metric 
and observe that all reference points for the Hausdorff distance are also reference 
points for the Frechet distance. 

We first need the concept of a reference point that was introduced in [1]. A 
reference point of a figure is a characteristic point with the property that similar 
figures have reference points that are close to each other. Therefore we get a 
reasonable matching of two figures if we simply align their reference points. 

Definition 17 (Reference Point, [1]) Let $\mathcal{K}$ be a set of compact subsets of 
$\mathbb{R}^2$ and $\delta$ be a metric on $\mathcal{K}$. A mapping $r : \mathcal{K} \to \mathbb{R}^2$ is called a $\delta$-reference point 
for $\mathcal{K}$ of quality $c > 0$ with respect to a set of transformations $\mathcal{T}$ on $\mathcal{K}$, if the 
following holds for any two sets $P, Q \in \mathcal{K}$ and each transformation $\tau \in \mathcal{T}$: 

(Equivariance) $r(\tau(P)) = \tau(r(P))$ (1) 

(Lipschitz continuity) $\|r(P) - r(Q)\| \le c \cdot \delta(P, Q)$. (2) 

In other words a reference point is a Lipschitz-continuous mapping between 
the metric spaces $(\mathcal{K}, \delta)$ and $(\mathbb{R}^2, \|\cdot\|)$ with Lipschitz constant c, which is 
equivariant under $\mathcal{T}$. Various reference points are known for a variety of distance 
measures and classes of transformations, e.g., the centroid of a convex 
polygon, which is a reference point of quality 11/3 for translations, using the area of 
the symmetric difference as a distance measure, see [3]. However, most work on 
reference points has focused on the Hausdorff distance, see [1]. 

Definition 18 (Hausdorff Distance) Let P and Q be curves. Then $\delta_H(P, Q)$ 
denotes their Hausdorff distance, defined as 

$$\delta_H(P,Q) := \max\bigl(\tilde\delta_H(P,Q), \tilde\delta_H(Q,P)\bigr), \quad \text{with}$$ 

$$\tilde\delta_H(X,Y) := \sup_{x \in X} \inf_{y \in Y} \|x - y\|,$$ 

the one-sided Hausdorff distance from X to Y. 



We will only mention the following result that provides a $\delta_H$-reference point 
for polygonal curves with respect to similarities, the so-called Steiner point. The 
Steiner point of a polygonal curve is the weighted average of the vertices of the 
convex hull of the curve, where each vertex is weighted by its exterior angle 
divided by $2\pi$. 
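
The description of the Steiner point above translates almost literally into code. The sketch below is an illustration under stated assumptions: vertices are (x, y) tuples, the convex hull is computed with Andrew's monotone chain (which takes O(n log n) time, so this is not the linear-time method referred to in Theorem 19), and the function names are hypothetical.

```python
from math import atan2, pi

def convex_hull(points):
    """Convex hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for pt in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], pt) <= 0:
                chain.pop()
            chain.append(pt)
        return chain[:-1]

    return half(pts) + half(reversed(pts))

def steiner_point(points):
    """Average of the hull vertices, each weighted by exterior angle / (2 pi)."""
    hull = convex_hull(points)
    k = len(hull)
    if k == 1:
        return (float(hull[0][0]), float(hull[0][1]))
    if k == 2:  # degenerate hull: both endpoints have exterior angle pi
        return ((hull[0][0] + hull[1][0]) / 2, (hull[0][1] + hull[1][1]) / 2)
    sx = sy = 0.0
    for i in range(k):
        prev, cur, nxt = hull[i - 1], hull[i], hull[(i + 1) % k]
        angle_in = atan2(cur[1] - prev[1], cur[0] - prev[0])
        angle_out = atan2(nxt[1] - cur[1], nxt[0] - cur[0])
        turn = (angle_out - angle_in) % (2 * pi)   # exterior angle at cur
        sx += turn / (2 * pi) * cur[0]
        sy += turn / (2 * pi) * cur[1]
    return (sx, sy)
```

For a square the four exterior angles are equal, so the Steiner point is the centre; for the right triangle (0,0), (1,0), (0,1) the weights are 1/4, 3/8 and 3/8, giving the point (3/8, 3/8).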

Theorem 19 (Aichholzer et al., [1]) The Steiner point is a $\delta_H$-reference 
point with respect to similarities of quality $4/\pi$. It can be computed in linear 
time. 

Note that the Steiner point is an optimal $\delta_H$-reference point with respect to 
similarities, i.e., the quality of any $\delta_H$-reference point for that transformation 
class is at least $4/\pi$, see [1]. 

Two feasible reparametrizations $\alpha$ and $\beta$ of P and Q demonstrate that for 
each point $P(\alpha(t))$ there is a point $Q(\beta(t))$ with $\|P(\alpha(t)) - Q(\beta(t))\| \le \varepsilon$ (and 
vice versa), thus $\delta_H(P, Q) \le \delta_F(P,Q)$. This shows the following observation: 

Observation 20 Let $c > 0$ be a constant and $\mathcal{T}$ be a set of transformations on 
$\mathcal{K}$. Then each $\delta_H$-reference point with respect to $\mathcal{T}$ is also a $\delta_F$-reference point 
with respect to $\mathcal{T}$ of the same quality. 

This shows that we can use the known $\delta_H$-reference points to obtain $\delta_F$-reference 
points. However, since each reparametrization has to map P(0) to Q(0), the 
distance $\|P(0) - Q(0)\|$ is a lower bound for $\delta_F(P,Q)$. So we get a new reference 
point that is substantially better than all known reference points for the 
Hausdorff distance. 

Observation 21 Let $\mathcal{C}_0$ be the set of all planar curves. The mapping 

$$r_0 : \mathcal{C}_0 \to \mathbb{R}^2, \quad P \mapsto P(0)$$ 

is a $\delta_F$-reference point for curves of quality 1 with respect to translations. 

The quality of this reference point, i.e., 1, is better than the quality of the 
Steiner point, which is $4/\pi$. Since the latter is an optimal reference point for the 
Hausdorff distance, this shows that for the Fréchet distance substantially better 
reference points exist. For closed curves however $r_0$ is not defined at all. 

Based on the existence of a $\delta_F$-reference point for $\mathcal{T}_2$ we obtain the following 
algorithm for approximate matchings with respect to the Fréchet distance under 
the group of translations, which is the same procedure as already used in [1] for 
the Hausdorff distance. 

Algorithm T: Compute r(P) and r(Q), translate P by $\tau := r(Q) - r(P)$, and 
output this matching as the approximate solution, together with $\delta_F(\tau(P),Q)$. 
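
The translation step of Algorithm T is short enough to write down. The sketch below covers only that step and does not compute the resulting Fréchet distance; the function name `match_by_reference_point` and its `ref` parameter are hypothetical, and the default reference point is the start point P(0) from Observation 21.

```python
def match_by_reference_point(p, q, ref=lambda curve: curve[0]):
    """Translate p so that its reference point lands on that of q.

    p and q are lists of (x, y) vertices; `ref` maps a curve to its
    reference point. Returns the translation vector and the moved copy of p.
    """
    rp, rq = ref(p), ref(q)
    tx, ty = rq[0] - rp[0], rq[1] - rp[1]
    moved = [(x + tx, y + ty) for x, y in p]
    return (tx, ty), moved
```

Any other equivariant reference point, such as a Steiner point routine, could be passed as `ref` without changing the rest of the procedure.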

Theorem 22 Suppose that r is a $\delta_F$-reference point of quality c with respect to 
translations that can be computed in $O(T_r(n))$ time. Then algorithm T produces 
a (c+1)-approximation to Problem 1 in $O(mn + T_r(m) + T_r(n))$ time. 

Proof. Let $\tau_{opt}$ be a translation, such that $\min_\tau \delta_F(\tau(P), Q) = \delta_F(\tau_{opt}(P),Q)$. 
Then 

$$\|r(\tau_{opt}(P)) - r(Q)\| \le c \cdot \delta_F(\tau_{opt}(P),Q).$$ 

Let $\tau_{diff} := r(\tau_{opt}(P)) - r(Q) \in \mathcal{T}_2$; then 

$$\tau_{approx} := \tau_{opt} - \tau_{diff}$$ 

maps r(P) onto r(Q) and 

$$\delta_F(\tau_{approx}(P),Q) \le \delta_F(\tau_{opt}(P), Q) + \|\tau_{diff}\| \le (c+1) \cdot \delta_F(\tau_{opt}(P), Q).$$ 

The proof of the claimed time bound is obvious. □ 

Note that with an idea from [9] it is possible to reduce the approximation 
constant for reference point based matching to $(1+\varepsilon)$ for any $\varepsilon > 0$; the idea 
places a sufficiently small grid of size $O(1/\varepsilon^2)$ around the reference point of Q 
and checks each grid point as a potential image point for the reference point of 
P. The runtime increases by a factor proportional to the grid size. 

Abstract. It is an open problem to characterize the class of languages 
recognized by quantum finite automata (QFA). We examine some neces- 
sary and some sufficient conditions for a (regular) language to be recog- 
nizable by a QFA. For a subclass of regular languages we get a condition 
which is necessary and sufficient. 

Also, we prove that the class of languages recognizable by a QFA is not 
closed under union or any other binary Boolean operation where both 
arguments are significant. 



1 Introduction 

A 1-way quantum finite automaton (QFA)^ is a theoretical model for a quantum 
computer with a finite memory. 

Compared to classical (non-quantum) automata, QFAs have both strengths 
and weaknesses. The strength of QFAs is shown by the fact that quantum au- 
tomata can be exponentially more space efficient than deterministic or prob- 
abilistic automata [AF 98]. The weakness of QFAs is caused by the fact that 
any quantum process has to be reversible (unitary). This makes QFAs unable to 
recognize some regular languages. 

The first result of this type was obtained by Kondacs and Watrous [KW 97] 
who showed that there is a language that can be recognized by a deterministic 
finite automaton (DFA) but cannot be recognized by QFA. Later, Brodsky and 
Pippenger [BP 99] generalized the construction of [KW 97] and showed that 
any regular language that does not satisfy the partial order condition cannot be 
recognized by a QFA. They also conjectured that all regular languages satisfying 
the partial order condition can be recognized by a QFA. 

In this paper, we disprove their conjecture. We show that, for a language to 
be recognizable by a QFA, its minimal deterministic automaton must not contain 
several “forbidden fragments”. One of the fragments is equivalent to the automaton 
not satisfying the partial order condition. The other fragments are new. 

A somewhat surprising feature of our “forbidden fragments” is that they 
consist of several parts (corresponding to different beginnings of the word) and 
the language corresponding to every one of them can be recognized but one 
cannot simultaneously recognize the whole language without violating unitarity. 

Our result implies that the set of languages recognizable by QFAs is not closed 
under union. In particular, the language consisting of all words in the alphabet 
{a, b} that have an even number of a’s after the first b is not recognizable by a 
QFA, although it is a union of two recognizable languages. (The first language 
consists of all words with an even number of a’s before the first b and an even 
number of a’s after the first b, the second language consists of all words with an 
odd number of a’s before the first b and an even number of a’s after it.) This 
answers a question of Brodsky and Pippenger [BP 99] . 
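
The decomposition into two languages can be checked mechanically on short words. The sketch below is only an illustration of the set identity, not of QFA recognizability; it restricts attention to words that contain at least one b, which is an assumption made here because the text does not say how words without b are treated. All function names are hypothetical.

```python
from itertools import product

def split_at_first_b(word):
    """Return the parts of `word` before and after its first b."""
    i = word.index("b")
    return word[:i], word[i + 1:]

def in_whole_language(word):
    """Even number of a's after the first b."""
    _, after = split_at_first_b(word)
    return after.count("a") % 2 == 0

def in_first_language(word):
    """Even number of a's before the first b and even number after it."""
    before, after = split_at_first_b(word)
    return before.count("a") % 2 == 0 and after.count("a") % 2 == 0

def in_second_language(word):
    """Odd number of a's before the first b and even number after it."""
    before, after = split_at_first_b(word)
    return before.count("a") % 2 == 1 and after.count("a") % 2 == 0

# All words over {a, b} of length 1 to 8 that contain at least one b.
words = ["".join(t) for k in range(1, 9) for t in product("ab", repeat=k) if "b" in t]
union_matches = all(
    in_whole_language(w) == (in_first_language(w) or in_second_language(w)) for w in words
)
```

On these words `union_matches` evaluates to true, and no word lies in both component languages, so the union is disjoint.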

For a subclass of regular languages (languages that do not contain the “two 
cycles in a row” construction shown in Fig. 3), we show that our conditions 
are necessary and sufficient for a language to be recognizable by a QFA. For 
arbitrary regular languages, we only know that these conditions are necessary 
but we do not know if all languages satisfying them can be recognized by a QFA. 

Due to the space constraints of these proceedings, most of the proofs are omitted. 



1.1 Definitions 

Quantum finite automata (QFA) were introduced independently by Moore and 
Crutchfield [MC 97] and Kondacs and Watrous [KW 97] . In this paper, we con- 
sider the more general definition of QFAs [KW 97] (which includes the definition 
of [MC 97] as a special case). 

Definition 1.1. A QFA is a tuple $M = (Q; \Sigma; V; q_0; Q_{acc}; Q_{rej})$ where $Q$ is a 
finite set of states, $\Sigma$ is an input alphabet, $V$ is a transition function (explained 
below), $q_0 \in Q$ is a starting state, and $Q_{acc} \subseteq Q$ and $Q_{rej} \subseteq Q$ are sets of 
accepting and rejecting states ($Q_{acc} \cap Q_{rej} = \emptyset$). The states in $Q_{acc}$ and $Q_{rej}$ 
are called halting states and the states in $Q_{non} = Q - (Q_{acc} \cup Q_{rej})$ are called 
non-halting states. 

States of M. The state of $M$ can be any superposition of states in $Q$ (i.e., any 
linear combination of them with complex coefficients). We use $|q\rangle$ to denote the 
superposition consisting of state $q$ only. $l_2(Q)$ denotes the linear space consisting 
of all superpositions, with $l_2$-distance on this linear space. 

Endmarkers. Let $\kappa$ and $\$$ be symbols that do not belong to $\Sigma$. We use $\kappa$ and 
$\$$ as the left and the right endmarker, respectively. We call $\Gamma = \Sigma \cup \{\kappa; \$\}$ the 
working alphabet of $M$. 

Transition Function. The transition function $V$ is a mapping from $\Gamma \times l_2(Q)$ 
to $l_2(Q)$ such that, for every $a \in \Gamma$, the function $V_a : l_2(Q) \to l_2(Q)$ defined by 
$V_a(x) = V(a, x)$ is a unitary transformation (a linear transformation on $l_2(Q)$ 
that preserves the $l_2$ norm). 

Computation. The computation of a QFA starts in the superposition $|q_0\rangle$. Then 
transformations corresponding to the left endmarker $\kappa$, the letters of the input 
word $x$ and the right endmarker $\$$ are applied. The transformation corresponding 
to $a \in \Gamma$ consists of two steps. 

1. First, $V_a$ is applied. The new superposition $\psi'$ is $V_a(\psi)$ where $\psi$ is the 
superposition before this step. 

2. Then, $\psi'$ is observed with respect to $E_{acc}$, $E_{rej}$, $E_{non}$ where 
$E_{acc} = \mathrm{span}\{|q\rangle : q \in Q_{acc}\}$, $E_{rej} = \mathrm{span}\{|q\rangle : q \in Q_{rej}\}$, 
$E_{non} = \mathrm{span}\{|q\rangle : q \in Q_{non}\}$. If the state before the measurement was 

$$\psi' = \sum_{q_i \in Q_{acc}} \alpha_i |q_i\rangle + \sum_{q_j \in Q_{rej}} \beta_j |q_j\rangle + \sum_{q_k \in Q_{non}} \gamma_k |q_k\rangle$$ 

then the measurement accepts $\psi'$ with probability $p_a = \sum |\alpha_i|^2$, rejects with 
probability $p_r = \sum |\beta_j|^2$ and continues the computation (applies transformations 
corresponding to next letters) with probability $p_c = \sum |\gamma_k|^2$ with the system having the 
(normalized) state $\psi / \|\psi\|$ where $\psi = \sum \gamma_k |q_k\rangle$. 

We regard these two transformations as reading a letter a. 

Unnormalized States. Normalization (replacing $\psi$ by $\psi / \|\psi\|$) is needed to make 
the probabilities of accepting, rejecting and non-halting after the next letter sum 
up to 1. However, normalizing the state after every letter can make the notation 
quite messy. (For the state after k letters, there would be k normalization factors.) 

For this reason, we do not normalize the states in our proofs. That is, we 
apply the next transformations to the unnormalized state $\psi$ instead of $\psi / \|\psi\|$. 

There is a simple correspondence between unnormalized and normalized 
states. If, at some point, the unnormalized state is $\psi$, then the normalized state 
is $\psi / \|\psi\|$ and the probability that the computation has not stopped is $\|\psi\|^2$. The 
sums $p_a = \sum |\alpha_i|^2$ and $p_r = \sum |\beta_j|^2$ are the probabilities that the computation has 
not halted before this moment but accepts (rejects) at this step. 

Notation. We use $V'_a$ to denote the transformation consisting of $V_a$ followed by 
projection to $E_{non}$. This is the transformation mapping $\psi$ to the non-halting 
part of $V_a(\psi)$. We use $V'_w$ to denote the product of transformations 
$V'_w = V'_{a_n} V'_{a_{n-1}} \cdots V'_{a_2} V'_{a_1}$ where $a_i$ is the $i$-th letter of the word $w$. 

We also use $\psi_w$ to denote the (unnormalized) non-halting part of the QFA’s state 
after reading the left endmarker $\kappa$ and the word $w \in \Sigma^*$. From the notation it 
follows that $\psi_w = V'_{\kappa w}(|q_0\rangle)$. 

Recognition of Languages. A QFA $M$ recognizes a language $L$ with probability 
$p$ ($p > \frac{1}{2}$) if it accepts any word $x \in L$ with probability $\ge p$ and rejects any 
word $x \notin L$ with probability $\ge p$. If we say that a QFA $M$ recognizes a language 
$L$ (without specifying the accepting probability), this means that $M$ recognizes 
$L$ with probability $\frac{1}{2} + \epsilon$ for some $\epsilon > 0$. 
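The two-step computation and the unnormalized-state convention described above can be sketched in a few lines of Python. The automaton below is a hypothetical toy example that does not come from this paper: a reversible automaton over the alphabet {a} that accepts words with an even number of a's. The state numbering, the matrices and the helper names (`permutation_matrix`, `run_qfa`) are assumptions made only for illustration, and the letter `k` stands for the left endmarker.

```python
# Toy QFA simulated with unnormalized states (illustrative sketch only).
# States: 0 = q0 (start), 1 = q1, 2 = accepting, 3 = rejecting.
NON_HALTING, ACCEPTING, REJECTING = [0, 1], [2], [3]


def permutation_matrix(images):
    """Unitary matrix that maps basis state i to basis state images[i]."""
    n = len(images)
    return [[1.0 if images[col] == row else 0.0 for col in range(n)]
            for row in range(n)]


V = {
    "k": permutation_matrix([0, 1, 2, 3]),  # left endmarker: identity
    "a": permutation_matrix([1, 0, 2, 3]),  # letter a: swap q0 and q1
    "$": permutation_matrix([2, 3, 0, 1]),  # right endmarker: q0 -> acc, q1 -> rej
}


def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def run_qfa(word):
    """Return (accepting probability, rejecting probability) for `word`."""
    psi = [1.0, 0.0, 0.0, 0.0]  # the superposition |q0>
    p_acc = p_rej = 0.0
    for symbol in ["k", *word, "$"]:
        psi = apply(V[symbol], psi)  # step 1: apply the unitary V_a
        # step 2: measure; squared amplitudes give the halting probabilities
        p_acc += sum(abs(psi[i]) ** 2 for i in ACCEPTING)
        p_rej += sum(abs(psi[i]) ** 2 for i in REJECTING)
        # continue with the unnormalized non-halting part only
        psi = [psi[i] if i in NON_HALTING else 0.0 for i in range(len(psi))]
    return p_acc, p_rej
```

Because the state is never renormalized, the squared amplitudes added at each step are already the probabilities of halting exactly at that step, so the two returned numbers can be compared directly with the threshold p of the definition.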

1.2 Previous Work 

The previous work on quantum automata has mainly considered 3 questions: 

1. What is the class of languages recognized by QFAs? 

2. What accepting probabilities can be achieved? 

3. How does the size of QFAs (the number of states) compare to the size of 
deterministic (probabilistic) automata? 

In this paper, we consider the first question. The first results in this direction 
were obtained by Kondacs and Watrous [KW 97]. 

Theorem 1.1. [KW 97] 

1. All languages recognized by QFAs are regular. 

2. There is a regular language that cannot be recognized by a QFA. 

Brodsky and Pippenger [BP 99] generalized the second part of Theorem 1.1 
by showing that any language satisfying a certain property is not recognizable 
by a QFA. 

Theorem 1.2. [BP 99] Let $L$ be a language and $M$ be its minimal automaton 
(the smallest DFA recognizing $L$). Assume that there are words $x$ and $y$ such 
that $M$ contains states $q_1$, $q_2$ satisfying: 

1. $q_1 \ne q_2$, 

2. If $M$ starts in the state $q_1$ and reads $x$, it passes to $q_2$, 

3. If $M$ starts in the state $q_2$ and reads $x$, it passes to $q_2$, and 

4. If $M$ starts in $q_2$ and reads $y$, it passes to $q_1$, 

then $L$ cannot be recognized by a quantum finite automaton (Fig. 1). 




Fig. 1. Conditions of Theorem 1.2 



A language $L$ whose minimal automaton does not contain the fragment of 
Theorem 1.2 is said to satisfy the partial order condition [MT 69]. Brodsky and 
Pippenger [BP 99] conjectured that any language satisfying the partial order 
condition is recognizable by a QFA. In this paper, we disprove this conjecture. 
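Whether a given DFA contains the fragment of Theorem 1.2 can be tested mechanically. The sketch below is only an illustration under stated assumptions: the DFA is given as a dictionary `delta[(state, letter)]`, and the function names are hypothetical. It relies on the observation that a word x with the required behaviour exists exactly when the pair (q2, q2) is reachable from (q1, q2) in the product automaton, and a word y exists exactly when q1 is reachable from q2.

```python
from collections import deque


def reachable(start, successors):
    """All nodes reachable from `start` (including itself), by breadth-first search."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in successors(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def find_theorem_1_2_fragment(states, alphabet, delta):
    """Return a pair (q1, q2) as in Theorem 1.2, or None if no such pair exists."""
    for q1 in states:
        for q2 in states:
            if q1 == q2:
                continue
            # x must send both q1 and q2 to q2: search the product automaton.
            pairs = reachable(
                (q1, q2),
                lambda p: [(delta[p[0], c], delta[p[1], c]) for c in alphabet],
            )
            # y must send q2 back to q1.
            back = reachable(q2, lambda q: [delta[q, c] for c in alphabet])
            if (q2, q2) in pairs and q1 in back:
                return q1, q2
    return None


# Example DFA 1 (assumed): words over {a, b} that end in a.
ENDS_IN_A = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 0}
# Example DFA 2 (assumed): the language a*b*, with state 2 as the dead state.
A_STAR_B_STAR = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 2, (1, "b"): 1,
                 (2, "a"): 2, (2, "b"): 2}
```

On these two examples, `find_theorem_1_2_fragment([0, 1], "ab", ENDS_IN_A)` finds a fragment (take x = a and y = b), while the automaton for a*b* contains none, so a*b* satisfies the partial order condition.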

Another direction of research studies the accepting probabilities of QFAs. 
First, Ambainis and Freivalds [AF 98] proved that the language $a^*b^*$ is recognizable 
by a QFA with probability 0.68... but not with probability $7/9 + \epsilon$ for any 
$\epsilon > 0$. Thus, the classes of languages recognizable with different probabilities are 
different. Further results in this direction were obtained by [ABFK 99] who studied 
the probability with which the languages $a_1^* \ldots a_n^*$ can be recognized. 

There are also many results about the number of states needed for a QFA to 
recognize different languages. In some cases, it can be exponentially less than for 
deterministic or even for probabilistic automata [AF 98, K 98]. In other cases, it 
can be exponentially bigger than for deterministic automata [ANTV 98, N 99]. 
A good survey about quantum automata is Gruska [G 00]. 

2 Main Results 

2.1 Necessary Condition 

First, we give the new condition which implies that the language is not recognizable 
by a QFA. Similarly to the previous condition (Theorem 1.2), it can 
be formulated as a condition about the minimal deterministic automaton of a 
language. In Section 3, we will give an example of a language that satisfies the 
condition of Theorem 2.1 but not the previously known condition of Theorem 
1.2 (the language $L_1$). 

Theorem 2.1. Let $L$ be a language. Assume that there are words $x$, $y$, $z_1$, $z_2$ 
such that its minimal automaton $M$ contains states $q_1$, $q_2$, $q_3$ satisfying: 

1. $q_2 \ne q_3$, 

2. if $M$ starts in the state $q_1$ and reads $x$, it passes to $q_2$, 

3. if $M$ starts in the state $q_2$ and reads $x$, it passes to $q_2$, 

4. if $M$ starts in the state $q_1$ and reads $y$, it passes to $q_3$, 

5. if $M$ starts in the state $q_3$ and reads $y$, it passes to $q_3$, 

6. for any word $t \in (x|y)^*$ there exists a word $t_1 \in (x|y)^*$ such that if $M$ 
starts in the state $q_2$ and reads $tt_1$, it passes to $q_2$, 

7. for any word $t \in (x|y)^*$ there exists a word $t_1 \in (x|y)^*$ such that if $M$ 
starts in the state $q_3$ and reads $tt_1$, it passes to $q_3$, 

8. if $M$ starts in the state $q_2$ and reads $z_1$, it passes to an accepting state, 

9. if $M$ starts in the state $q_2$ and reads $z_2$, it passes to a rejecting state, 

10. if $M$ starts in the state $q_3$ and reads $z_1$, it passes to a rejecting state, 

11. if $M$ starts in the state $q_3$ and reads $z_2$, it passes to an accepting state. 

Then $L$ cannot be recognized by a QFA. 

Proof. We use lemmas from [BV 97] and [AF 98]. 

Lemma 2.1. [BV 97] If $\psi$ and $\phi$ are two quantum states and $\|\psi - \phi\| < \epsilon$ then 
the total variational distance between probability distributions generated by the 
same measurement on $\psi$ and $\phi$ is at most $2\epsilon$. (The lemma in [BV 97] has $4\epsilon$ 
but it can be improved to $2\epsilon$.) 



Fig. 2. Conditions of Theorem 2.1, Conditions 6 and 7 Are Shown Symbolically 



Lemma 2.2. [AF 98] Let $x \in \Sigma^+$. There are subspaces $E_1$, $E_2$ such that 
$E_{non} = E_1 \oplus E_2$ and 

(i) If $\psi \in E_1$, then $V'_x(\psi) \in E_1$ and $\|V'_x(\psi)\| = \|\psi\|$, 

(ii) If $\psi \in E_2$, then $\|V'_{x^k}(\psi)\| \to 0$ when $k \to \infty$. 

Lemma 2.2 can be viewed as a quantum counterpart of the classification of 
states for Markov chains [KS 76]. The classification of states divides the states 
of a Markov chain into ergodic sets and transient sets. If the Markov chain is 
in an ergodic set, it never leaves it. If it is in a transient set, it leaves it with 
probability $1 - \epsilon$ for an arbitrary $\epsilon > 0$ after sufficiently many steps. 

In the quantum case, $E_1$ is the counterpart of an ergodic set: if the quantum 
random process defined by repeated reading of $x$ is in a state $\psi \in E_1$, it stays 
in $E_1$. $E_2$ is a counterpart of a transient set: if the state is $\psi \in E_2$, $E_2$ is left 
(for an accepting or rejecting state) with probability arbitrarily close to 1 after 
sufficiently many x’s. 
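For readers less familiar with the classical picture, the following sketch classifies the states of a finite Markov chain into ergodic and transient ones. It is an illustration of the classical notion only, not of the quantum lemma; the input format (a dictionary mapping each state to the set of states it can move to with positive probability) and the function name are assumptions. A state is treated as ergodic when every state reachable from it can reach it back.

```python
def classify_states(transitions):
    """Split the states of a finite Markov chain into (ergodic, transient) sets.

    transitions[s] is the set of states reachable from s in one step
    with positive probability.
    """
    def reach(start):
        seen, stack = {start}, [start]
        while stack:
            s = stack.pop()
            for t in transitions[s]:
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen

    reach_sets = {s: reach(s) for s in transitions}
    # s is ergodic iff it can be reached again from everything it can reach
    ergodic = {s for s in transitions
               if all(s in reach_sets[t] for t in reach_sets[s])}
    return ergodic, set(transitions) - ergodic
```

For the chain with transitions `{0: {0, 1}, 1: {2}, 2: {1, 2}}`, state 0 is transient (once the chain moves to state 1 it never returns) and states 1 and 2 form an ergodic set.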

The next Lemma is our generalization of Lemma 2.2 for the case of two 
different words x and y. 

Lemma 2.3. Let $x, y \in \Sigma^+$. There are subspaces $E_1$, $E_2$ such that 
$E_{non} = E_1 \oplus E_2$ and 

(i) If $\psi \in E_1$, then $V'_x(\psi) \in E_1$ and $V'_y(\psi) \in E_1$ and $\|V'_x(\psi)\| = \|\psi\|$ and 
$\|V'_y(\psi)\| = \|\psi\|$, 

(ii) If $\psi \in E_2$, then for any $\epsilon > 0$, there exists $t \in (x|y)^*$ such that $\|V'_t(\psi)\| < \epsilon$. 

Proof. Omitted. □ 

Let $L$ be a language with its minimal automaton $M$ containing the “forbidden 
construction” and $M_q$ be a QFA. We show that $M_q$ cannot recognize $L$. 

For a word $w$, let $\psi_w = \psi^1_w + \psi^2_w$, $\psi^1_w \in E_1$, $\psi^2_w \in E_2$. 









Fix a word $w$ after reading which $M$ is in the state $q_1$. We find a word 
$a \in (x|y)^*$ such that after reading $xa$ $M$ is in the state $q_2$ and the norm of 
$\psi^2_{wxa} = V'_a(\psi^2_{wx})$ is at most some fixed $\epsilon > 0$. (Such a word exists due to Lemma 
2.3 and conditions 6 and 7.) We also find a word $b$ such that $\|\psi^2_{wyb}\| \le \epsilon$. 

Because of unitarity of $V'_x$ and $V'_y$ on $E_1$ (part (i) of Lemma 2.3), there exist 
integers $i$ and $j$ such that $\|\psi^1_{w(xa)^i} - \psi^1_w\| \le \epsilon$ and $\|\psi^1_{w(yb)^j} - \psi^1_w\| \le \epsilon$. 

Let $p$ be the probability of $M_q$ accepting while reading $\kappa w$. Let $p_1$ be the 
probability of accepting while reading $(xa)^i$ with a starting state $\psi_w$, $p_2$ be the 
probability of accepting while reading $(yb)^j$ with a starting state $\psi_w$ and $p_3$, $p_4$ 
be the probabilities of accepting while reading $z_1\$$ and $z_2\$$ starting at $\psi^1_w$. 
Let us consider four words $\kappa w(xa)^i z_1\$$, $\kappa w(xa)^i z_2\$$, $\kappa w(yb)^j z_1\$$, $\kappa w(yb)^j z_2\$$. 

Lemma 2.4. $M_q$ accepts $\kappa w(xa)^i z_1\$$ with probability at least $p + p_1 + p_3 - 4\epsilon$ 
and at most $p + p_1 + p_3 + 4\epsilon$. 

Proof. The probability of accepting while reading $\kappa w$ is $p$. After that, $M_q$ is in 
the state $\psi_w$ and reading $(xa)^i$ from $\psi_w$ causes it to accept with probability $p_1$. 

The remaining state is $\psi_{w(xa)^i} = \psi^1_{w(xa)^i} + \psi^2_{w(xa)^i}$. If it was $\psi^1_w$, the probability 
of accepting while reading the rest of the word ($z_1\$$) would be exactly $p_3$. 
It is not quite $\psi^1_w$ but it is close to $\psi^1_w$. Namely, we have 

$$\|\psi_{w(xa)^i} - \psi^1_w\| \le \|\psi^2_{w(xa)^i}\| + \|\psi^1_{w(xa)^i} - \psi^1_w\| \le \epsilon + \epsilon = 2\epsilon.$$ 

By Lemma 2.1, the probability of accepting during $z_1\$$ is between $p_3 - 4\epsilon$ and 
$p_3 + 4\epsilon$. □ 

Similarly, on the second word $M_q$ accepts with probability between $p + p_1 + p_4 - 4\epsilon$ 
and $p + p_1 + p_4 + 4\epsilon$. On the third word $M_q$ accepts with probability 
between $p + p_2 + p_3 - 4\epsilon$ and $p + p_2 + p_3 + 4\epsilon$. On the fourth word $M_q$ accepts 
with probability between $p + p_2 + p_4 - 4\epsilon$ and $p + p_2 + p_4 + 4\epsilon$. 

This means that the sum of accepting probabilities of two words that belong 
to $L$ (the first and the fourth) differs from the sum of accepting probabilities of 
two words that do not belong to $L$ (the second and the third) by at most $16\epsilon$. 
Hence, the probability of correct answer of $M_q$ on one of these words is at most 
$\frac{1}{2} + 4\epsilon$. Since such 4 words can be constructed for arbitrarily small $\epsilon$, $M_q$ does 
not recognize $L$. □ 
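The last step of the proof can be spelled out as follows. The names $A_1, \ldots, A_4$ (the accepting probabilities of $M_q$ on the four words, in the order listed) and $\rho$ (the probability of correct answer that $M_q$ would have to achieve on all four words) are introduced only for this sketch.

```latex
\begin{align*}
\bigl| (A_1 + A_4) - (A_2 + A_3) \bigr| &\le 16\epsilon
  && \text{(the estimates $2p + p_1 + p_2 + p_3 + p_4$ cancel; each $A_k$ is off by at most $4\epsilon$)} \\
A_1, A_4 \ge \rho, \quad A_2, A_3 \le 1 - \rho
  &\;\Longrightarrow\; (A_1 + A_4) - (A_2 + A_3) \ge 4\rho - 2 \\
4\rho - 2 \le 16\epsilon
  &\;\Longrightarrow\; \rho \le \tfrac{1}{2} + 4\epsilon
\end{align*}
```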

2.2 Necessary and Sufficient Condition 

For languages whose minimal automaton does not contain the construction of 
Figure 3, this condition (together with Theorem 1.2) is necessary and sufficient. 

Theorem 2.2. Let $U$ be the class of languages whose minimal automaton does 
not contain “two cycles in a row” (Fig. 3). A language that belongs to $U$ can 
be recognized by a QFA if and only if its minimal deterministic automaton does 
not contain the “forbidden construction” from Theorem 1.2 and the “forbidden 
construction” from Theorem 2.1. 

Fig. 3. Conditions of Theorem 2.2 

3 Non-closure under Union 

In particular, Theorem 2.1 implies that the class of languages recognized by 
QFAs is not closed under union. 

Let $L_1$ be the language consisting of all words that start with any number of 
letters a and in which, after the first letter b (if there is one), there is an odd 
number of letters a. Its minimal automaton $G_1$ is shown in Fig. 4. 




Fig. 4. Automaton $G_1$ 



This language satisfies the conditions of Theorem 2.1. ($q_1$, $q_2$ and $q_3$ of 
Theorem 2.1 are just $q_1$, $q_2$ and $q_3$ of $G_1$; $x$, $y$, $z_1$ and $z_2$ are b, aba, a and b.) Hence, 
it cannot be recognized by a QFA. 
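The claim that these states and words satisfy the conditions of Theorem 2.1 can be checked mechanically. The sketch below is a hedged illustration: since Fig. 4 is not reproduced here, the transition table `G1` is an assumption read off from the definition of the language (q1: before the first b; q2 and q3: an even, respectively odd, number of a's after the first b), and all function names are hypothetical. Conditions 6 and 7 are tested by computing the set of states reachable via strings over {x, y} and checking that each of them can return.

```python
def run(delta, state, word):
    """State reached from `state` after reading `word`."""
    for letter in word:
        state = delta[state, letter]
    return state


def closure(delta, state, words):
    """States reachable from `state` by reading any sequence of the given words."""
    seen, stack = {state}, [state]
    while stack:
        s = stack.pop()
        for w in words:
            t = run(delta, s, w)
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def can_always_return(delta, q, words):
    """Conditions 6 and 7: after any t in (x|y)*, some t1 in (x|y)* leads back to q."""
    return all(q in closure(delta, s, words) for s in closure(delta, q, words))


def satisfies_theorem_2_1(delta, accepting, q1, q2, q3, x, y, z1, z2):
    """Check conditions 1-11 of Theorem 2.1 for the given states and words."""
    return (q2 != q3
            and run(delta, q1, x) == q2 and run(delta, q2, x) == q2
            and run(delta, q1, y) == q3 and run(delta, q3, y) == q3
            and can_always_return(delta, q2, (x, y))
            and can_always_return(delta, q3, (x, y))
            and run(delta, q2, z1) in accepting
            and run(delta, q2, z2) not in accepting
            and run(delta, q3, z1) not in accepting
            and run(delta, q3, z2) in accepting)


# Assumed transition table of G1 (not copied from Fig. 4).
G1 = {("q1", "a"): "q1", ("q1", "b"): "q2",
      ("q2", "a"): "q3", ("q2", "b"): "q2",
      ("q3", "a"): "q2", ("q3", "b"): "q3"}
G1_ACCEPTING = {"q1", "q3"}
```

With these assumptions, `satisfies_theorem_2_1(G1, G1_ACCEPTING, "q1", "q2", "q3", "b", "aba", "a", "b")` evaluates to `True`.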

Consider 2 other languages $L_2$ and $L_3$ defined as follows. 

$L_2$ consists of all words which start with an even number of letters a and 
after first letter b (if there is one) there is an odd number of letters a. 

$L_3$ consists of all words which start with an odd number of letters a and after 
first letter b (if there is one) there is an odd number of letters a. 

It is easy to see that $L_1 = L_2 \cup L_3$. 
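The identity can also be confirmed by brute force on all short words. The following sketch makes one interpretive assumption, stated here and in the code: the phrase "(if there is one)" is read as saying that a word without any b satisfies the condition on the part after the first b. The function names are hypothetical.

```python
from itertools import product


def split_at_first_b(word):
    """Return (leading a's, whether a b occurs, the part after the first b)."""
    head, sep, tail = word.partition("b")
    return head, sep == "b", tail


def in_L1(word):
    # Assumption: a word without b belongs to the language.
    _, has_b, tail = split_at_first_b(word)
    return (not has_b) or tail.count("a") % 2 == 1


def in_L2(word):
    head, _, _ = split_at_first_b(word)
    return len(head) % 2 == 0 and in_L1(word)


def in_L3(word):
    head, _, _ = split_at_first_b(word)
    return len(head) % 2 == 1 and in_L1(word)


def words_up_to(n):
    """All words over {a, b} of length at most n."""
    for length in range(n + 1):
        for letters in product("ab", repeat=length):
            yield "".join(letters)


def union_and_disjointness_hold(n):
    """Check L1 = L2 ∪ L3 and L2 ∩ L3 = ∅ on all words of length at most n."""
    words = list(words_up_to(n))
    union_ok = all(in_L1(w) == (in_L2(w) or in_L3(w)) for w in words)
    disjoint = not any(in_L2(w) and in_L3(w) for w in words)
    return union_ok and disjoint
```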

The minimal automata $G_2$ and $G_3$ are shown in Fig. 5 and Fig. 6. They do 
not contain any of the “forbidden constructions” of Theorem 2.2. Therefore, $L_2$ 
and $L_3$ can be recognized by a QFA and we get 

Theorem 3.1. There are two languages $L_2$ and $L_3$ which are recognizable by a 
QFA but the union of them $L_1 = L_2 \cup L_3$ is not recognizable by a QFA. 

Corollary 3.1. The class of languages recognizable by a QFA is not closed under 
union. 

This answers a question of Brodsky and Pippenger [BP 99]. 

As $L_2 \cap L_3 = \emptyset$, we also have $L_1 = L_2 \Delta L_3$. So the class of languages recognizable 
by a QFA is not closed under symmetric difference. From this and from the fact 
that this class is closed under complement, it follows: 

Corollary 3.2. The class of languages recognizable by a QFA is not closed under 
any binary boolean operation where both arguments are significant. 

Instead of using the general construction of Theorem 2.2, we can also use 
a construction specific to languages $L_2$ and $L_3$. This gives simpler QFAs and 
achieves a better probability of correct answer. (Theorem 2.2 gives QFAs for 
$L_2$ and $L_3$ with the probability of correct answer 3/5. Our construction below 
achieves the probability of correct answer 2/3.) 




Fig. 5. Automaton $G_2$ 



Fig. 6. Automaton $G_3$ 



Theorem 3.2. There are two languages $L_2$ and $L_3$ which are recognizable by a 
QFA with probability $\frac{2}{3}$ but the union of them $L_1 = L_2 \cup L_3$ is not recognizable 
with a QFA (with any probability $1/2 + \epsilon$, $\epsilon > 0$). 

This is the best possible, as shown by the following theorem. 



Theorem 3.3. If 2 languages $L_1$ and $L_2$ are recognizable by a QFA with probabilities 
$p_1$ and $p_2$ and $\frac{1}{p_1} + \frac{1}{p_2} < 3$ then $L = L_1 \cup L_2$ is also recognizable by a QFA 
with probability 

$$\frac{2 p_1 p_2}{p_1 + p_2 + p_1 p_2}.$$ 

Corollary 3.3. If 2 languages $L_1$ and $L_2$ are recognizable by a QFA with probabilities 
$p_1$ and $p_2$ and $p_1 > 2/3$ and $p_2 > 2/3$, then $L = L_1 \cup L_2$ is recognizable 
by a QFA with probability $p_3 > 1/2$. 
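A short computation with exact fractions shows why 2/3 is the threshold. The function name below is hypothetical; it evaluates the expression $2 p_1 p_2 / (p_1 + p_2 + p_1 p_2)$ from Theorem 3.3. Simple algebra shows that this value exceeds 1/2 exactly when $1/p_1 + 1/p_2 < 3$.

```python
from fractions import Fraction


def union_probability(p1, p2):
    """Value of 2*p1*p2 / (p1 + p2 + p1*p2) from Theorem 3.3."""
    return 2 * p1 * p2 / (p1 + p2 + p1 * p2)


def exceeds_one_half(p1, p2):
    """True when the bound of Theorem 3.3 is strictly larger than 1/2."""
    return union_probability(p1, p2) > Fraction(1, 2)
```

At $p_1 = p_2 = 2/3$ the expression equals exactly 1/2, which gives no recognition; for $p_1 = p_2 = 3/4$ it equals 6/11, which is larger than 1/2.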

4 More “Forbidden” Constructions 

If we allow the “two cycles in a row” construction, Theorem 2.2 is no longer true. 
More and more complicated “forbidden fragments” that imply non-recognizability 
by a QFA are possible. 

Theorem 4.1. Let $L$ be a language and $M$ be its minimal automaton. If $M$ 
contains a fragment of the form shown in Figure 7 where $a, b, c, d, e, f, g, h, i \in \Sigma^*$ 
are words and $q_0$, $q_a$, $q_b$, $q_c$, $q_{ad}$, $q_{ae}$, $q_{bd}$, $q_{bf}$, $q_{ce}$, $q_{cf}$ are states of $M$ and 

1. If $M$ reads $x \in \{a, b, c\}$ in the state $q_0$, its state changes to $q_x$. 

2. If $M$ reads $x \in \{a, b, c\}$ in the state $q_x$, its state again becomes $q_x$. 

3. If $M$ reads any string consisting of a, b and c in the state $q_x$ ($x \in \{a, b, c\}$), 
it moves to a state from which it can return to the same $q_x$ by reading some 
(possibly, different) string consisting of a, b and c. 

4. If $M$ reads $y \in \{d, e, f\}$ in the state $q_x$ ($x \in \{a, b, c\}$), it moves to $q_{xy}$ 
(whenever the state $q_{xy}$ is defined). 

5. If $M$ reads $y \in \{d, e, f\}$ in the state $q_{xy}$, its state again becomes $q_{xy}$. 

6. If $M$ reads any string consisting of d, e and f in the state $q_{xy}$ it moves 
to a state from which it can return to the same state $q_{xy}$ by reading some 
(possibly, different) string consisting of d, e and f. 

7. Reading h in the state $q_{ad}$, i in the state $q_{be}$ and g in the state $q_{cf}$ lead to 
accepting states. Reading g in $q_{ae}$, h in $q_{bf}$ and i in $q_{cd}$ lead to rejecting 
states. 

then $L$ is not recognizable by a QFA. 

The existence of the “forbidden construction” of Theorem 4.1 does not imply 
the existence of any of the previously shown “forbidden constructions”. To show 
this, consider the alphabet $\Sigma = \{a, b, c, d, e, f, g, h, i\}$ and languages of the form 
$L_{x,y,z} = x(a|b|c)^* y(d|e|f)^* z$ where $x \in \{a, b, c\}$, $y \in \{d, e, f\}$, $z \in \{g, h, i\}$. Let $L$ 
be the union of languages $L_{x,y,z}$ corresponding to black squares in Figure 8. 

Theorem 4.2. The minimal automaton of L does not contain the “forbidden 
constructions” of Theorems 1.2 and 2.1. 

However, one can easily see that the minimal automaton of $L$ contains the 
“forbidden construction” of Theorem 4.1. (Just take $q_0$ to be the starting state 
and make a, b, ..., i of Theorem 4.1 equal to the corresponding letters in the 
alphabet $\Sigma$.) This means that the existence of the “forbidden construction” of Theorem 
4.1 does not imply the existence of the previous “forbidden constructions”. 

Theorem 4.1 can be generalized to any number of levels (cycles following one 
another) and any number of branchings at one level as long as every arc from 
one vertex to another is traversed the same number of times in paths leading to 
accepting states and in paths leading to rejecting states. 

A general “forbidden construction” is as follows. 



Note (to condition 4 of Theorem 4.1): we do not have this constraint (and the next 
two constraints) for the pairs x = a, y = f; x = b, y = e; and x = c, y = d, for which 
the state $q_{xy}$ is not defined. 

Fig. 7. Conditions of Theorem 4.1 



Fig. 8. The Language L 



Level 1 of a construction consists of a state $q_1$ and some words $a_{11}, a_{12}, \ldots$. 

Level 2 consists of the states $q_{21}, q_{22}, \ldots$ where the automaton goes if it reads 
one of the words of Level 1 in a state in Level 1. We require that, if the automaton 
starts in one of the states of Level 2 and reads any string consisting of words of Level 
1, it can return to the same state by reading some string consisting of these words. 

Level 2 also has some words $a_{21}, a_{22}, \ldots$. 

Level 3 consists of the states $q_{31}, q_{32}, \ldots$ where the automaton goes if it reads 
one of the words of Level 2 in a state in Level 2. We require that, if the automaton 
starts in one of the states of Level 3 and reads any string consisting of words of Level 
2, it can return to the same state by reading some string consisting of these words. 

Again, Level 3 also has some words $a_{31}, a_{32}, \ldots$. 

Level n consists of the states $q_{n1}, q_{n2}, \ldots$ where the automaton goes if it 
reads one of the words of Level n − 1 in a state in Level n − 1. 

Let us denote all different words in this construction as $a_1, a_2, a_3, \ldots, a_m$. 

For a word $a_i$ and a level $j$ we construct sets of states $B_{ij}$ and $D_{ij}$. A state 
$q$ in level $j + 1$ belongs to $B_{ij}$ if the word $a_i$ belongs to level $j$ and $M$ moves to 
$q$ after reading $a_i$ in some state in level $j$. A state belongs to $D_{ij}$ if this state 
belongs to Level n and it is reachable from $B_{ij}$. 

Theorem 4.3. Assume that the minimal automaton $M$ of a language $L$ contains 
the “forbidden construction” of the general form described above and, in 
this construction, for each $D_{ij}$ the number of accepting states is equal to the 
number of rejecting states. Then, $L$ cannot be recognized by a QFA. 

Theorems 2.1 and 4.1 are special cases of this theorem (with 3 and 4 levels, 
respectively). 



Abstract. It is known that recognition of regular languages by finite 
monoids can be generalized to context-free languages and finite 
groupoids, which are finite sets closed under a binary operation. A loop 
is a groupoid with a neutral element and in which each element has a 
left and a right inverse. It has been shown that finite loops recognize 
exactly those regular languages that are open in the group topology. In 
this paper, we study the class of aperiodic loops, which are those loops 
that contain no nontrivial group. We show that this class is stable under 
various definitions, and we prove some closure properties. We also prove 
that aperiodic loops recognize only star-free open languages and give 
some examples. Finally, we show that the wreath product principle can 
be applied to groupoids, and we use it to prove a decomposition theorem 
for recognizers of regular open languages. 



1 Introduction 

A monoid M is a set closed under a binary associative operation and that con- 
tains a two-sided identity element. The free monoid over an alphabet A is de- 
noted by A* and is defined as the set of all finite sequences of letters in A, with 
concatenation being the operation and the empty sequence $\epsilon$ playing the role of 
the identity. 

The cornerstone of the algebraic theory of machines is the observation that 
any finite monoid can be seen as a finite state machine and that recognition 
of regular languages reduces to multiplication in a monoid. More formally, let 
$L \subseteq A^*$ be a language, let $M$ be a monoid, and let $\phi : A^* \to M$ be a morphism. 
We say that $M$ recognizes $L$ if there exists $F \subseteq M$ such that $L = \phi^{-1}(F)$. 

Kleene’s Theorem can then be stated as follows: a language is regular if and 
only if it is recognized by a finite monoid (e.g. see [16,23]). 
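As a concrete instance of this definition, the sketch below evaluates a morphism into a finite monoid letter by letter and tests membership of the result in F. The example monoid (the integers modulo 2 under addition), the morphism and the function name are illustrative assumptions, not taken from the paper.

```python
def monoid_recognizes(word, phi, multiply, identity, accepting):
    """Decide whether phi(word) lies in the accepting subset F of the monoid.

    phi maps letters to monoid elements; it extends to words because the
    operation `multiply` is associative and `identity` is its neutral element.
    """
    value = identity
    for letter in word:
        value = multiply(value, phi[letter])
    return value in accepting


# Hypothetical example: Z_2 recognizes the words with an even number of a's.
PHI = {"a": 1, "b": 0}


def add_mod_2(u, v):
    return (u + v) % 2


def even_number_of_as(word):
    return monoid_recognizes(word, PHI, add_mod_2, 0, {0})
```

Here the language recognized is the preimage of F = {0}, that is, the set of words whose image under the morphism is the identity of Z_2.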

A. Ferreira and H. Reichel (Eds.): STACS 2001, LNCS 2010, pp. 87-98, 2001. 
© Springer-Verlag Berlin Heidelberg 2001 




Martin Beaudry, François Lemieux, and Denis Thérien 



Using this definition, it becomes possible to classify regular languages accord- 
ing to the algebraic properties of the monoids that recognize them. Two famous 
examples are the following. 

A language $L \subseteq A^*$ is piecewise-testable if it is in the Boolean closure of 
languages of the form $A^* a_1 A^* \cdots A^* a_n A^*$ where $n \ge 0$ and $a_i \in A$, for all 
$0 < i \le n$. A monoid is J-trivial iff any two distinct elements generate distinct 
two-sided ideals. A theorem of Simon ([28]) says that a language is piecewise-testable 
if and only if it is recognized by a J-trivial monoid. 

A language is star-free if it is in the closure of $\{\{a\} : a \in A\} \cup \{\epsilon\}$ under 
Boolean operations and concatenation. A monoid $M$ is aperiodic if there exists 
an integer $n$ such that for every $a \in M$ we have $a^n = a^{n+1}$. Then, a deep result 
due to Schützenberger ([27]) states that star-free languages are precisely those 
languages that can be recognized by an aperiodic monoid. 

This algebraic approach can also be used in the context of parallel com- 
plexity. By replacing homomorphisms by polynomial-length programs, it is pos- 
sible to characterize algebraically well-known classes of Boolean circuits such as 
$\mathrm{NC}^1$, $\mathrm{AC}^0$ and $\mathrm{ACC}^0$. Important open questions about the computing power of 
these models of parallel computation can be phrased in purely algebraic terms 
(see [4,5]). 

Hence, the study of regular languages has become a rich theory with many 
deep results and applications, and it remains an active field that continues to 
challenge researchers. This makes more striking the observation that no such 
theory exists for context-free languages. Nevertheless, this topic has been the 
subject of recent investigations (e.g. [18,21,10,13,19,7,8,20]) that we briefly 
describe here. 

A groupoid $G$ is a set with a binary operation that can be non-associative. 
All groupoids considered in this paper are finite. Groupoids can be used as 
language recognizers as follows. For any $w \in G^*$, denote with $G(w)$ the set of all 
elements $g \in G$ such that $w$ can be evaluated to $g$ using some parenthesization. 
Let $L \subseteq A^*$ be a language, let $G$ be a groupoid, and let $\phi : A^* \to G^*$ be a 
morphism induced by a function $\phi : A \to G$. We say that $G$ recognizes $L$ if there 
exists a subset $F \subseteq G$ such that for any $w \in A^*$ we have that $w \in L$ if and 
only if $G(\phi(w)) \cap F \ne \emptyset$. When $G$ is associative, this definition corresponds to 
the recognition by monoid defined above. Our interest in groupoids comes from 
the fact that a language is context-free if and only if it is recognized by a finite 
groupoid (e.g. see [18,10]). 
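The set G(w) can be computed without enumerating parenthesizations one by one, using dynamic programming over the factors of w in the same way as the CYK algorithm for context-free grammars. The sketch below is an illustration under stated assumptions: the groupoid is given by a multiplication table `table[a][b]` on the elements 0, ..., n-1, and the 5-element table used as an example is hypothetical (it is not taken from the paper).

```python
def evaluations(word, table):
    """Set of all values of `word` (a list of groupoid elements) over all parenthesizations."""
    n = len(word)
    # values[i][j] holds the possible values of the factor word[i:j]
    values = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, g in enumerate(word):
        values[i][i + 1] = {g}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):  # position of the outermost product
                values[i][j] |= {table[a][b]
                                 for a in values[i][k] for b in values[k][j]}
    return values[0][n]


def groupoid_recognizes(word, phi, table, accepting):
    """True iff some parenthesization of phi(word) evaluates into `accepting`."""
    return bool(evaluations([phi[c] for c in word], table) & accepting)


# Hypothetical non-associative groupoid on {0, 1, 2, 3, 4}.
EXAMPLE_TABLE = [
    [0, 1, 2, 3, 4],
    [1, 0, 3, 4, 2],
    [2, 4, 0, 1, 3],
    [3, 2, 4, 0, 1],
    [4, 3, 1, 2, 0],
]
```

In this table the word 1 1 2 has two parenthesizations with different values: (1·1)·2 = 2 and 1·(1·2) = 4, so `evaluations([1, 1, 2], EXAMPLE_TABLE)` is the set {2, 4}.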

In the absence of a general theory of groupoids, a classification of the context- 
free languages based on the algebraic properties of the groupoids that recognize 
them is still a major research project. This approach could also have implications 
in complexity theory since context-free languages are related to the class $\mathrm{SAC}^1$ of 
polynomial size, logarithmic depth circuits constructed with NOT gates, binary 
AND gates and OR gates of unbounded fan-in. Hence, a better understanding of 
the algebraic nature of these languages could be an important tool in the study 
of small circuit classes. 

A subclass of groupoids that has been studied intensively (see [2,3,11,12,14,22]) 
is the family of finite loops. A loop is a groupoid that possesses a unique 
identity element and that satisfies the cancellation law: every element has a right 
and a left inverse (not necessarily identical). We observe that the multiplication 
table of a finite loop is such that every row and column is a permutation. Hence, 
a group is an associative loop but not all loops are associative (the smallest 
example is the loop of Section 4). 
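The observation about multiplication tables gives a direct test for whether a finite table defines a loop. The sketch below assumes the elements are numbered 0, ..., n-1 and the table is square; the function names and both example tables are hypothetical. In particular, the 5-element table is only an illustrative non-associative loop and is not claimed to be the loop of Section 4.

```python
def is_loop(table):
    """True iff every row and column is a permutation and there is a two-sided identity."""
    n = len(table)
    elements = set(range(n))
    rows_ok = all(set(row) == elements for row in table)
    cols_ok = all({table[r][c] for r in range(n)} == elements for c in range(n))
    identities = [e for e in range(n)
                  if all(table[e][x] == x and table[x][e] == x for x in range(n))]
    return rows_ok and cols_ok and len(identities) == 1


def is_associative(table):
    """True iff (a*b)*c == a*(b*c) for all elements a, b, c."""
    n = len(table)
    return all(table[table[a][b]][c] == table[a][table[b][c]]
               for a in range(n) for b in range(n) for c in range(n))


# The cyclic group Z_3: an associative loop.
Z3 = [[0, 1, 2],
      [1, 2, 0],
      [2, 0, 1]]

# A hypothetical loop of order 5 that is not associative:
# (1*1)*2 = 2 but 1*(1*2) = 4.
LOOP5 = [[0, 1, 2, 3, 4],
         [1, 0, 3, 4, 2],
         [2, 4, 0, 1, 3],
         [3, 2, 4, 0, 1],
         [4, 3, 1, 2, 0]]
```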

In [13], it has been shown that any language recognized by a finite loop is 
regular. Despite this lack of power of finite loops, their investigation is essential 
in order to better understand the non-associativity of general groupoids and the 
languages they recognize. This result has been refined in [9], where an exact 
characterization of the languages recognized by finite loops is given. A language 
is recognized by a loop if and only if it is an open regular language in the group 
topology (see [25]). A simpler way to express this result is that a language $L \subseteq A^*$ 
is recognized by a loop if and only if it is a finite union of languages of the form 
$L_0 a_1 L_1 \cdots L_{n-1} a_n L_n$, where $n \ge 0$, $a_i \in A$, and each $L_i$ is a group language. 

One of the main tools used in [9] is an operation, called the wreath product, 
that takes a loop and a group to get another loop. The wreath product plays an 
important role in the algebraic theory of machines, as it is the algebraic formal- 
ization of the notion of series connection. A fundamental decomposition result 
states that any monoid can be decomposed as a wreath product of components, 
each of which is either a group or an aperiodic monoid (see [16]). In this paper 
we will prove a similar decomposition theorem for loops. 

Our presentation is divided as follows. In Section 2, we define aperiodic and 
group-free loops, and we prove that these two classes of loops are equivalent. In 
Section 3, we show that aperiodic loops recognize only star-free languages. Some 
closure properties of aperiodic loops are demonstrated in Section 4. In Section 5, 
we introduce the notion of algebraic transduction which is a sequential function 
performed by a pushdown machine. We then relate this kind of transduction with 
the wreath product of loops, generalizing the wreath product principle of monoids 
(e.g. see [23]). Using this relation, we prove in Section 6 that aperiodic loops 
can recognize languages of the form $A^* a_1 A^* \cdots A^* a_n A^*$. As a consequence, any 
regular open language is the finite union of languages $S_i$ that can be recognized 
by the wreath product of a loop $B_i$ and a group $G_i$, where $B_i$ is a loop that 
recognizes a language of the form $A^* a_1 A^* \cdots A^* a_n A^*$. 



2 Aperiodic and Group-Free Loops 

A loop $B$ is said to divide another loop $A$ if $B$ is a morphic image of a subloop 
of $A$. A loop $B$ is said to be group-free if there exists no nontrivial group that 
divides $B$. Given a word $w \in B^*$, we use $B(w)$ to denote the set of all elements 
that can be obtained from the evaluation of $w$ in $B$ using any parenthesization. 
When there is no ambiguity, we simply write $w$ instead of $B(w)$. As an example, 
if $b \in B$, we write $b \in w$ (or $w \ni b$) whenever there exists a way to evaluate $w$ 
in order to get $b$. When $B$ is associative, then the set $w$ is a singleton and we 
simply write $w = b$. 

An element $b$ of a loop $B$ is said to be aperiodic if there exists an integer $n$ 
such that for any $m \ge n$, the set $b^m$ is equal to $b^n$. A loop is aperiodic if all its 
elements are aperiodic. We observe that for any element $g$ of any loop $B$ there 
exists $n > 0$ such that $g^n$ contains the identity. It follows that if $g$ is aperiodic 
then for a sufficiently large $m$ we have $g^m = \langle g \rangle$ where $\langle g \rangle$ is the subloop of $B$ 
generated by $g$. 
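The sets $g^n$ can be computed recursively, since every parenthesization of n copies of g splits at its outermost product into k copies and n − k copies. The sketch below is only an illustration: the function name and the 5-element example table are hypothetical and do not come from the paper.

```python
def power_sets(g, table, max_n):
    """Return {n: g^n} for n = 1..max_n, where g^n collects all parenthesizations."""
    powers = {1: {g}}
    for n in range(2, max_n + 1):
        powers[n] = {table[a][b]
                     for k in range(1, n)
                     for a in powers[k]
                     for b in powers[n - k]}
    return powers


# Hypothetical loop of order 5 with identity 0 (rows and columns are permutations).
LOOP5 = [[0, 1, 2, 3, 4],
         [1, 0, 3, 4, 2],
         [2, 4, 0, 1, 3],
         [3, 2, 4, 0, 1],
         [4, 3, 1, 2, 0]]
```

In this example the powers of the element 1 alternate between {1} and {0}, so they never stabilize: the element 1 is not aperiodic, in agreement with the fact that it generates a subloop isomorphic to the two-element group.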

The above definitions can be extended to any groupoid G in a straightforward 
way. In particular, when G is a monoid, it is known that G is group-free if and 
only if it is aperiodic [27]. We will prove in this section that this is also true for 
loops. 

Given any loop $B$ we define the set $I(B) = \{g \in B : \exists n_0 > 0, \forall n \ge n_0, e \in g^n\}$ 
where $e$ is the identity of $B$. The following lemma shows that $I(B)$ corresponds 
precisely to the set of all aperiodic elements of $B$. 

Lemma 1. Let $g$ be an element of a loop $B$. If $g \in I(B)$ then there exists $n_1$ 
such that for all $n \ge n_1$ we have $g^n = \langle g \rangle$. 

Proof. Let $e$ be the identity of $B$. Since $g \in I(B)$, there exists $n_0$ such that 
$e \in g^n$ for all $n \ge n_0$. Let $k_0$ be such that for any $b \in \langle g \rangle$ there exists $k \le k_0$ 
such that $b \in g^k$. Hence, for all $b \in \langle g \rangle$ and for all $m \ge n_0 + k_0$ we have $b \in g^m$. 

Proposition 1. If a loop B is aperiodic then it is group-free. 

Proof. Let $B = I(B)$ and suppose that a nontrivial group $G$ divides $B$. Then, 
there exists a subloop $S \subseteq B$ and a surjective morphism $\phi : S \to G$. Since $G$ 
is nontrivial, there must exist $g \in G$ such that $g^n \ne g^{n+1}$ for all $n > 0$. This 
means that $\phi^{-1}(g^n) \cap \phi^{-1}(g^{n+1}) = \emptyset$ for all $n > 0$. Hence $\phi^{-1}(g) \cap I(B) = \emptyset$, 
contradicting the aperiodicity of $B$. 

To prove the other direction, we need the following classical lemma from 
number theory. 

Lemma 2. Let $p$ and $q$ be two coprime integers. There exists an integer $n_0$ such 
that for all $n \ge n_0$ there exist $a, b \ge 1$ such that $n = ap + bq$. 
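Lemma 2 is easy to explore numerically. The sketch below, with a hypothetical function name, tests by exhaustive search whether n can be written as ap + bq with both coefficients at least 1.

```python
def representable(n, p, q):
    """True iff n = a*p + b*q for some integers a >= 1 and b >= 1."""
    return any(n - a * p > 0 and (n - a * p) % q == 0
               for a in range(1, n // p + 1))


def non_representable(p, q, bound):
    """All n in 1..bound that cannot be written as a*p + b*q with a, b >= 1."""
    return [n for n in range(1, bound + 1) if not representable(n, p, q)]
```

For p = 3 and q = 5, the largest integer up to 100 that is not representable is 15, and 16, 17 and 18 are all representable; adding multiples of 3 then covers every larger integer, so $n_0 = 16$ works in this case.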

Lemma 3. Let $b \in B$ and let $n, m \ge 1$ be two coprime integers such that $e \in b^n$ 
and $e \in b^m$, where $e$ is the identity of $B$. Then, there exists $k_0$ such that for all 
$k \ge k_0$ we have $e \in b^k$. 

Proof. Using Lemma 2, when $k$ is large enough we can write $k = in + jm$ with 
$i, j \ge 1$, and then $b^k \ni e$. 

Lemma 4. Let $B$ be a loop with identity $e$ and let $g \in B$. Suppose there exists
an integer $n \geq 2$ such that for any integer $m$, if $e \in g^m$ then $n \mid m$. Then, $B$ is
divided by a nontrivial group.

Proof. Define for each $0 \leq i < n$ the set $S_i = \{b \in B : \exists c \geq 0, b \in g^{cn+i}\}$. Hence,
$S_0$ is a subloop of $B$ since if $a \in g^{cn}$ and $b \in g^{dn}$ then $ab \in g^{(c+d)n}$.

Also, if $0 \leq i < j < n$ then $S_i \cap S_j = \emptyset$. Indeed, if $a \in S_i \cap S_j$ then there
exists $b \in B$ such that $b \in S_0 \cap S_k$ where $k = i + n - j$. Let $d > 0$ be such that
$e \in b g^d$. Since $b \in S_0$, $n$ must divide $d$ and since $b \in S_k$ we must have $e \in S_{k+d}$.
But this contradicts the condition on $B$ given in the statement of the lemma.

Hence, $S_0$ is a normal subloop of $B$ and $B/S_0 \cong \mathbb{Z}_n$ (see [22] for a discussion
on normal subloops).

Proposition 2. If $B$ is a group-free loop then it is also aperiodic.

Proof. Lemmas 3 and 4 imply that $B = I(B)$. The conclusion follows from
Lemma 1.

We have thus proved

Theorem 1. A loop is group-free if and only if it is aperiodic.



3 Aperiodic Loops Recognize only Star-Free Languages 



Lemma 5 (Schützenberger). A language is star-free if and only if it can be
recognized by an aperiodic monoid.

Lemma 6. A regular language $L \subseteq A^*$ is star-free if and only if there exists
$n_0 > 0$ such that for all $x, y, w \in A^*$ and for all $n > n_0$ we have $x w^n y \in L$ iff
$x w^{n+1} y \in L$.

Lemma 7. Let $B$ be an aperiodic loop. There exists $p_0 > 0$ such that for all
$x, y, w \in B^*$ and all $n > p_0$ we have $x w^n y = x w^{n+1} y$.

Proof. Let $p$ be such that $|x w^p y|$ is maximal, where $|S|$ denotes the cardinality
of the set $S$. Let $n_0 > 0$ be such that $e \in w^n$ for all $n > n_0$, where $e$ is the identity
of $B$. Hence, $x w^p y \subseteq x w^{p+n} y$ for all $n > n_0$ and since $|x w^p y|$ is maximal,
$x w^p y = x w^{p+n} y = x w^{p+n+1} y$ for all $n > n_0$. Thus, it is sufficient to take
$p_0 = p + n_0$.



As a consequence of the above two lemmas we have 
Proposition 3. Aperiodic loops can recognize only star-free languages. 

We have shown the following theorem. 

Theorem 2. Let $B$ be a finite loop. The following conditions are equivalent.

1. $B$ is group-free

2. $B$ is aperiodic

3. $B = I(B)$

4. $B$ recognizes only star-free languages

5. There exists $n_0 > 0$ such that $a b^n c = a b^{n+1} c$ for all $a, b, c \in B$ and all
$n > n_0$.

4 Other Properties of Aperiodic Loops

We begin this section with two examples of aperiodic (group-free) loops. In the
sequel we will refer to these loops as $B_5$ and $B_7$.





Loop $B_5$:

0 1 2 3 4
1 2 0 4 3
2 3 4 0 1
3 4 1 2 0
4 0 3 1 2

Loop $B_7$:

0 1 2 3 4 5 6
1 2 0 4 3 6 5
2 0 3 5 6 4 1
3 4 5 6 1 2 0
4 3 6 1 5 0 2
5 6 4 2 0 1 3
6 5 1 0 2 3 4



The reader will verify (hopefully using some software!) that all elements in
$B_5$ and $B_7$ are aperiodic. Loop $B_7$ is commutative but no aperiodic loop of even
order can be commutative. Actually, we can prove something stronger.
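
The software check suggested here can be sketched as follows. The two tables are transcribed from the multiplication tables of this section (row $a$, column $b$ holds $ab$, and 0 is the identity). All function names are hypothetical. The test relies on Lemmas 1 and 3: an element $g$ is certified aperiodic as soon as the identity occurs in two consecutive power sets $g^n$ and $g^{n+1}$, since $n$ and $n+1$ are coprime. The cutoff `max_n` is an arbitrary assumption, so a `None` result would be inconclusive rather than a proof of non-aperiodicity.

```python
B5 = [
    [0, 1, 2, 3, 4],
    [1, 2, 0, 4, 3],
    [2, 3, 4, 0, 1],
    [3, 4, 1, 2, 0],
    [4, 0, 3, 1, 2],
]
B7 = [
    [0, 1, 2, 3, 4, 5, 6],
    [1, 2, 0, 4, 3, 6, 5],
    [2, 0, 3, 5, 6, 4, 1],
    [3, 4, 5, 6, 1, 2, 0],
    [4, 3, 6, 1, 5, 0, 2],
    [5, 6, 4, 2, 0, 1, 3],
    [6, 5, 1, 0, 2, 3, 4],
]


def is_loop(table):
    """Latin square whose element 0 acts as a two-sided identity."""
    n = len(table)
    full = set(range(n))
    rows_ok = all(set(row) == full for row in table)
    cols_ok = all({table[i][j] for i in range(n)} == full for j in range(n))
    identity_ok = all(table[0][x] == x and table[x][0] == x for x in range(n))
    return rows_ok and cols_ok and identity_ok


def power_sets(table, g, max_n):
    """powers[n] = values of all parenthesizations of g...g (n factors), 1 <= n <= max_n."""
    powers = [set(), {g}]
    for n in range(2, max_n + 1):
        current = set()
        for i in range(1, n):
            for a in powers[i]:
                for b in powers[n - i]:
                    current.add(table[a][b])
        powers.append(current)
    return powers


def identity_certificate(table, g, max_n=40):
    """Smallest n <= max_n with the identity in both g^n and g^(n+1), else None."""
    powers = power_sets(table, g, max_n + 1)
    for n in range(1, max_n + 1):
        if 0 in powers[n] and 0 in powers[n + 1]:
            return n
    return None


if __name__ == "__main__":
    for name, table in (("B5", B5), ("B7", B7)):
        print(name, "is a loop:", is_loop(table))
        print(name, "certificates:",
              [identity_certificate(table, g) for g in range(len(table))])
```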

Let $a$ be an element of a loop $B$ that is different from the identity $e$. The left
inverse $a^\lambda$ and the right inverse $a^\rho$ of $a$ are defined as the unique solutions to the
equation $a^\lambda a = a a^\rho = e$.

Lemma 8. If $B$ is an aperiodic loop of even order, then there exists $a \in B$ such
that $a^\lambda \neq a^\rho$. In particular, there exists no commutative aperiodic loop of even
order.

Proof. Suppose that $B$ has even order and that $a^\lambda = a^\rho$ for all $a \in B$. Then, for
each $a \in B$ there exists a unique $b \in B$ such that $ab = ba = e$. Since $|B|$ is even
and $ee = e$ there must exist an element $c \in B$ which is different from $e$ and such
that $cc = e$. This implies that $B$ is not group-free.

Lemma 9. Let $B$ be a loop that does not recognize the language OR $= A^* a A^*$.
Then, for every $b \in B$ different from the identity $e$, there exists $k \geq 2$ such that
$b^k = \{e\}$.

Proof. Let $b \in B$ be different from the identity $e$. If $|b^i| = 1$ for all $i > 0$ then
$b$ generates a group and $b^k = e$ for some $k \geq 2$. Otherwise, let $j \geq 3$ be the
smallest integer such that $|b^j| \geq 2$. If for every $i < j$ we have $b^i \neq \{e\}$, then
$B$ can recognize the language OR $= A^* a A^*$. It suffices to map the letter $a$ to
the element $b$ and all other letters in $A$ to the identity $e$. The accepting set is
$B - \{e\}$.



Corollary 1. If $B$ is an aperiodic loop of even order then $B$ recognizes the
language OR $= A^* a A^*$.

Proof. Suppose that $B$ cannot recognize OR. By the above lemma, we have that
for all $b \in B$ different from the identity $e$, there exists $k \geq 3$ such that $b^k = \{e\}$.
Observe that we cannot have $k = 2$ since $B$ is aperiodic. By the cancellation
law, there exists $c \in B$ such that $b^{k-1} = \{c\}$. Thus, we must have $bc = cb = e$,
contradicting Lemma 8.




Star-Free Open Languages and Aperiodic Loops 






Indeed, loop $B_7$ is an example of an aperiodic loop such that $a^\lambda = a^\rho$ for all
elements $a$. However, this loop, like all aperiodic loops known to the authors, can
recognize the language OR (we will see in Section 6 the importance of this fact).
Actually, the only languages known to be recognizable by a finite aperiodic loop
are unions of

- languages of the form $A^*\{aa + bb\}A^*$,

- languages of the form $A^* a_1 A^* \cdots A^* a_n A^*$,

- cofinite languages.

We show in Section 6 how to recognize languages of the second form. A 
construction for cofinite languages is given in the full version of this paper. The 
following aperiodic loop can recognize languages of the first form. 

0 1 2 3 4 5 6 7 8 9
1 3 0 6 5 7 8 9 2 4
2 0 4 5 6 9 7 8 3 1
3 5 6 8 9 1 4 2 7 0
4 6 5 7 8 3 1 0 9 2
5 9 7 4 0 8 2 3 1 6
6 8 9 2 7 4 3 1 0 5
7 4 8 1 2 0 9 6 5 3
8 2 3 9 1 6 0 5 4 7
9 7 1 0 3 2 5 4 6 8

It suffices to use the morphism induced by $a \mapsto 1$ and $b \mapsto 2$. The accepting
set is $\{3, 4, 5, 6, 7, 8, 9\}$. One can verify that all words not in the language evaluate
only to 0, 1 or 2. All words of length 7 evaluate to at least 4 distinct values (the
only exceptions are words of the form 22121212... that evaluate to only three
distinct values). Finally, all other words can be checked exhaustively.
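
The exhaustive part of this verification can be sketched in code. The table below is transcribed from the 10-element loop above. The alphabet is assumed to be just $\{a, b\}$, and a word counts as accepted when at least one of its parenthesizations evaluates into the accepting set (the non-deterministic reading of recognition by a groupoid). The function names are hypothetical, and the length bound passed to `check_up_to` is a free parameter rather than the bound used by the authors.

```python
from itertools import product

B10 = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 3, 0, 6, 5, 7, 8, 9, 2, 4],
    [2, 0, 4, 5, 6, 9, 7, 8, 3, 1],
    [3, 5, 6, 8, 9, 1, 4, 2, 7, 0],
    [4, 6, 5, 7, 8, 3, 1, 0, 9, 2],
    [5, 9, 7, 4, 0, 8, 2, 3, 1, 6],
    [6, 8, 9, 2, 7, 4, 3, 1, 0, 5],
    [7, 4, 8, 1, 2, 0, 9, 6, 5, 3],
    [8, 2, 3, 9, 1, 6, 0, 5, 4, 7],
    [9, 7, 1, 0, 3, 2, 5, 4, 6, 8],
]
PHI = {"a": 1, "b": 2}             # morphism a -> 1, b -> 2
ACCEPTING = {3, 4, 5, 6, 7, 8, 9}  # accepting set from the text


def evaluations(table, values):
    """All results obtained by parenthesizing a nonempty sequence of elements."""
    n = len(values)
    best = {(i, i + 1): {values[i]} for i in range(n)}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            result = set()
            for m in range(i + 1, j):          # split point of the top-level product
                for x in best[(i, m)]:
                    for y in best[(m, j)]:
                        result.add(table[x][y])
            best[(i, j)] = result
    return best[(0, n)]


def accepts(word):
    return bool(evaluations(B10, [PHI[c] for c in word]) & ACCEPTING)


def in_language(word):
    """Membership in A*{aa + bb}A* for A = {a, b}."""
    return "aa" in word or "bb" in word


def check_up_to(max_len):
    """True if acceptance matches the language on all nonempty words up to max_len."""
    return all(
        accepts("".join(w)) == in_language("".join(w))
        for n in range(1, max_len + 1)
        for w in product("ab", repeat=n)
    )


if __name__ == "__main__":
    print("agreement on all words of length <= 8:", check_up_to(8))
```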

The fact that languages recognized by aperiodic loops are closed under union 
is a direct consequence of the following proposition that is proved in the full 
version of this paper. 

Proposition 4. The direct product of two (aperiodic) loops is an (aperiodic) loop.

The above result can be generalized to the wreath product. The wreath product
of two loops is defined as in the associative case. Let $S$ and $T$ be two loops
with identity $e_S$ and $e_T$, respectively. The product $S \circ T$ is the set $S^T \times T$ together
with the binary operation

$(f_1(u), t_1)(f_2(u), t_2) = (f_1(u) f_2(t_1 u), t_1 t_2)$

Proposition 5. The wreath product of two (aperiodic) loops is an (aperiodic)
loop.

5 The Wreath Product Principle

We consider in this section a finite groupoid as an algebraic structure that can be
used to recognize both word languages and tree languages. The free groupoid over
an alphabet $A$ is denoted with $A^{(*)}$ and defined as the set of all well parenthesized
expressions over $A$. A tree over an alphabet $A$ is an element of $A^{(*)}$. We define the
function $\mathrm{Yield} : A^{(*)} \to A^*$ as follows. For all $a \in A \cup \{\epsilon\}$ we have $\mathrm{Yield}(a) = a$
and if $t \in A^{(*)}$ with $t = (t_1 t_2)$ then $\mathrm{Yield}(t) = \mathrm{Yield}(t_1)\mathrm{Yield}(t_2)$. We say
that a word $w \in A^*$ is the yield of a tree $t \in A^{(*)}$ whenever $w = \mathrm{Yield}(t)$.

We recursively define the set of (right) combs in $A^{(*)}$ as follows. Each element
of $A \cup \{\epsilon\}$ is a comb; if $a \in A$ and $c$ is a comb then $(ac)$ is a comb; nothing else
is a comb. Hence, a comb in $A^{(*)}$ is simply a word in $A^*$ that is parenthesized
from right to left.

One can define an (associative) operation on combs such that the product of
two combs $c_1$ and $c_2$ gives another comb $c = c_1 \cdot c_2$ which is the concatenation
of the yields of $c_1$ and $c_2$ parenthesized from right to left.
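
The definitions of yield, comb and comb product translate directly into code. In the sketch below, which is an illustration under assumed representations and not the authors' formalism, a tree is either a one-letter string, the empty string (standing for the empty tree), or a pair `(left, right)`. All function names are hypothetical.

```python
def yield_of(tree):
    """Concatenate the leaves of a tree from left to right."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    return yield_of(left) + yield_of(right)


def right_comb(word):
    """Parenthesize a word from right to left: 'abc' -> ('a', ('b', 'c'))."""
    if len(word) <= 1:
        return word
    return (word[0], right_comb(word[1:]))


def comb_product(c1, c2):
    """Product of two combs: the comb over the concatenation of their yields."""
    return right_comb(yield_of(c1) + yield_of(c2))


if __name__ == "__main__":
    print(right_comb("abcd"))
    print(comb_product(right_comb("ab"), right_comb("cd")))
```

Because the product only depends on the yields, its associativity reduces to the associativity of word concatenation.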

We define a stack automaton as a tuple $M = (A, G, F, \varphi)$, where

- $A$ is an alphabet

- $G$ is a finite groupoid whose elements form the set of states

- $F \subseteq G$ is the set of final states

- $\varphi : A \to G$ is a function

We admit only two types of transitions.

1. Read the next input character $a \in A$

Push the current state $h \in G$

Go to the next state $\varphi(a)$

2. Pop $g \in G$

Go to the state $gh$, where $h$ is the current state.

Initially, the stack is empty. The machine accepts its input if and only if there 
is no more character to be read, the stack is empty, and the current state is in 
F. 

Remark. Recognition with a stack automaton is essentially identical to
recognition with a finite groupoid. In particular, the input of a stack automaton
can be a word in $A^*$ or a tree in $A^{(*)}$. In the first case, the machine is seen as a
non-deterministic device while it is deterministic in the second case.
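
A deterministic run on a tree can be sketched as follows. The sketch uses the 5-element loop $B_5$ of Section 4 as the groupoid and a hypothetical map `PHI`. One point is an assumption of this illustration and not stated in the text: before the first letter is read there is no current state, so the first transition of type 1 pushes nothing. With that convention the stack is empty exactly when the whole tree has been evaluated.

```python
B5 = [
    [0, 1, 2, 3, 4],
    [1, 2, 0, 4, 3],
    [2, 3, 4, 0, 1],
    [3, 4, 1, 2, 0],
    [4, 0, 3, 1, 2],
]
PHI = {"a": 1, "b": 2}  # hypothetical choice of the function phi : A -> G


def run_stack_automaton(table, phi, tree):
    """Run on a fully parenthesized tree given as a string, e.g. '((ab)a)'.

    Returns the final state and the final stack content.
    """
    stack = []
    state = None  # assumption: no current state before the first letter
    for symbol in tree:
        if symbol == "(":
            continue                      # open parentheses are ignored
        if symbol == ")":
            left = stack.pop()            # transition of type 2: pop g,
            state = table[left][state]    # go to state g*h
        else:
            if state is not None:
                stack.append(state)       # transition of type 1: push h,
            state = phi[symbol]           # go to state phi(a)
    return state, stack


def accepts_tree(table, phi, final_states, tree):
    state, stack = run_stack_automaton(table, phi, tree)
    return not stack and state in final_states


if __name__ == "__main__":
    for tree in ("((ab)a)", "(a(ba))"):
        print(tree, run_stack_automaton(B5, PHI, tree))
```

The two trees in the demo have the same yield `aba` but evaluate to different elements, which is why the same machine is non-deterministic on words.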

Stack automata will be used to define a special kind of transductions that we
call algebraic. Let $M = (A, G, F, \varphi)$ be a stack automaton. For each $a \in G$, let
$L(a)$ be the function $L(a) : x \mapsto ax$ and let $M_L(G)$ be the monoid generated by
the set $\{L(g) : g \in G\}$. An algebraic transducer is a tuple $T = (M, B, \tau)$ where

- $M = (A, G, F, \varphi)$ is a stack automaton

- $B$ is an alphabet

- $\tau : M_L(G) \times A \to B^{(*)}$ is the output function

Such a transducer behaves according to the following two rules.

1. After each transition of type 1, the transducer writes the expression $\tau(h(x), a) \in B^{(*)}$,
where $x \in G^*$ is the content of the stack and $h : G^* \to M_L(G)$ is
the morphism induced by $h(a) = L(a)$.

2. After each transition of type 2, the transducer replaces the two last written
expressions, $e_1$ and $e_2$, with the expression $(e_1 e_2)$.

We will be particularly interested in the case where

- $A \subseteq G$

- $B = M_L(G) \times A$

- $\tau$ is the identity function.

We then say that $T$ is the natural transducer of $G$ over the alphabet $A$.

We also denote with $T : A^{(*)} \to B^{(*)}$ the function computed by the algebraic
transducer $T$. Such a function is called an algebraic transduction. When $A \subseteq G$, the
natural transduction is defined in an evident way from the natural transducer.
One easily verifies the following proposition.

Proposition 6. Any algebraic transduction is a homomorphic image of some 
natural transduction. 

Let $G$ and $H$ be two groupoids and let $A$ and $B = M_L(G) \times A$ be two alphabets.
Let $M = (A, H, F, \varphi)$ be a stack automaton and $T = (M, B, \tau)$ be the natural
transducer of $M$. Moreover, let $F \subseteq G \times H$ and let $h : (B \times A)^{(*)} \to G \times H$ be a
morphism. Recall that we also use $T$ to denote the natural transduction
$T : A^{(*)} \to B^{(*)}$.

We define the tree language $L_{h,F} \subseteq A^{(*)}$ as the set of all trees $u$ such that
$h(T(u), u) \in F$. We can now define the class of languages $T(G, H) = \{L_{h,F} :
F \subseteq G \times H$ and $h : (B \times A)^{(*)} \to G \times H$ is a morphism$\}$.

The wreath product principle is stated with the following two theorems.

Theorem 3. If $L \in T(G, H)$ then $L$ is recognized by $G \circ H$.

Theorem 4. If $L \subseteq A^{(*)}$ is recognized by $G \circ H$ then $L$ is in the positive Boolean
closure of $T(G, H)$ (more precisely, $L$ is a finite union of finite intersections of
languages in $T(G, H)$).

We close this section with an observation that will be useful in the next
section. Let $F : A^{(*)} \to B^{(*)}$ be an algebraic function and let $t_1, t_2, t_3 \in A^{(*)}$.
Since algebraic transductions preserve the parenthesization, the transduction of
$(t_1 t_2)$ is $(s_1 s_2)$ for some $s_1, s_2 \in B^{(*)}$. Similarly, $F((t_1 t_3)) = (s_3 s_4)$ for some
$s_3, s_4 \in B^{(*)}$. Moreover, we must have $s_1 = s_3$ since an algebraic transducer
would output the transduction of $t_1$ before reading the rest of the input. In this
sense, algebraic transductions are sequential functions.

The above observation remains true in a slightly more complex situation.

Lemma 10. Let $F : A^{(*)} \to B^{(*)}$ be an algebraic transduction. Let $t \in A^{(*)}$ be
a tree and $c = c_1 \cdot c_2$ be a comb in $A^{(*)}$. Let $F((tc)) = (s_1 d)$ and $F(((t c_1) c_2)) =
((s_2 e) g)$ for some $s_1, s_2, d, e, g \in B^{(*)}$. Then $s_1 = s_2$ and $d, e, g$ are combs.
Moreover, there exist two combs $d_1, d_2$ in $B^{(*)}$ such that $d = d_1 \cdot d_2$ and such
that the yields of $e$ and $d_1$ are identical.

Proof. Given a tree $t \in A^{(*)}$, a stack automaton behaves as follows. It ignores
open parentheses, it uses transitions of type 1 on elements of $A$ and it uses
transitions of type 2 on close parentheses. On the other hand, a natural transducer
writes a symbol in $B$ after a transition of type 1 and modifies the parenthesization
of the current output after a transition of type 2. Let $k$ be the number of leaves
in $(t c_1)$.

The above description of the behaviour of a natural transducer implies that
after doing $k$ transitions of type 1, there is no difference between the output
produced on inputs $(tc)$ and $((t c_1) c_2)$. This is because in both cases, up to that
point, the transducer has seen the same input $t$ (recall that open parentheses are
ignored). The result follows from the fact that future transitions of the transducer
cannot modify the yield of this part of the output.

6 Application of the Wreath Product 

We show in this section that if $L \subseteq A^*$ is a language recognized by a finite loop
$P$ and if $a \in A$ then the language $LaA^*$ is recognized by $B \circ P$, where $B$ is any
group-free loop that possesses a certain property. As a consequence, we have
that languages of the form $A^* a_1 A^* \cdots A^* a_n A^*$ are recognized by a group-free
loop. We conclude with a decomposition theorem for finite loops.

Let L be a loop and g G L. Denote with the comb with n leaves in 
Let {g) be the subloop generated by g and let C{g) be the subset of {g) 
defined by C{g) = : n > 0}. Let C be the class of loops for which there 

exist $x, g \in L$ such that $g \in C(x)$ but $g^\rho \notin C(x)$ (recall that $g^\rho$ is the right
inverse of $g$).

As we will see, loops in C have some nice properties. In particular, we have 
the following lemma. 

Lemma 11. Any loop in $C$ can recognize $A^* a A^*$.

Proof. Let $B$ be a loop that cannot recognize $A^* a A^*$. Hence, by Lemma 9, for
all $g \in B$, there exists $k > 0$ such that $g^k = \{e\}$, where $e$ is the identity of $B$.
This means that $C(g) = \{g^n : 0 < n \leq k\}$. And since $g g^i = g^i g$ for all $i < k$,
we have that the right (and left) inverse of any element in $C(g)$ is also in $C(g)$.
Hence $B \notin C$.

Let $P$ be a loop. For each $a \in P$, define the permutations $L(a) : x \mapsto ax$ and
$R(a) : x \mapsto xa$. We define the multiplication group $M(P)$ as the group generated
by all $L(a)$ and $R(a)$, $a \in P$.

Proposition 7. Let $L \subseteq A^*$ be a language recognized by a loop $P$ and let $G$ be
a loop in $C$. Then, $LaA^*$ is recognized by $G \circ P$.

Proof. Let 1 G G be an element such that H ^ C'(l) and denote with 0 the 
identity of G. Assume that the accepting set of P is F. Let B = A4{P) x A and 
let h : B^*'> — > G be the morphism induced by h{D,x) = 1 whenever P(l) G F 
and $x = a$, and $h(D, x) = 0$ otherwise. Let $T$ be the natural transduction of $P$.
By Proposition 3, it suffices to show that a word $w \in A^*$ is in $L$ if and only if
there exists a tree $t \in A^{(*)}$ such that $w$ is the yield of $t$ and such that $h(T(t))$
evaluates in $G$ to an element different from 0.

Let $w \in LaA^*$ and let $w = uav$ where $u \in L$ such that $u$ is as small as
possible (this means that $u$ is not in $LaA^*$). We can parenthesize $u$ into a tree
$s \in A^{(*)}$ that evaluates to an element in $F$, and we can parenthesize $v$ into a
comb $c \in A^{(*)}$ to get a tree $t = (s(ac)) \in A^{(*)}$ with yield $w$.

Let $q = (z(xy)) \in B^{(*)}$ be the transduction of $t$. It is clear that $x = [L(u), a] \in
B$ is such that $h(x) = 1$. Moreover, for all leaves $g$ of $z$ we have $h(g) = 0$, since
$u$ is not in $LaA^*$. If $h(q) \neq 0$ then $q$ is accepted by $G$ and everything is fine.

Consider the case where $h(q) = 0$. This can happen only if the number $k$ of
leaves in the comb $h(xy)$ that are labeled with 1 is congruent to zero modulo
the cardinality of $C(1)$. Since $h(x) = 1$, we have that $k > 0$. Let $d \in C(1)$ be
such that ^ Cid)- There exists i < k such that = d and we can write 
xy = a • P, where h{a) = d. 

By Lemma 10, we can parenthesize w as w' = ((sa')c') such that the trans- 
duction of w' is {{zx')y'), where x' = a and y' is some comb in B^*\ This means 
that h{zx')y' is different from 0 since h{xy') = d and d^ ^ G{g). 

Loop $B_5$ of Section 4 is an example of a loop in $C$. Not all aperiodic loops
are in $C$, however. This is the case of loop $B_7$ of Section 4. The question as to
whether any aperiodic loop can be used in Proposition 7 is open.

Given a loop $B$ and an integer $n > 0$, we define the loop $B^n$ recursively on
$n$ as follows. $B^1 = B$ and, for $i > 1$, $B^i = B \circ B^{i-1}$. By Proposition 5, if $B$ is
group-free then so is $B^n$.

Corollary 2. Let $A$ be a finite alphabet. For any loop $B \in C$, for any $n > 0$ and
for any $a_1, \ldots, a_n \in A$, the language $A^* a_1 A^* \cdots A^* a_n A^*$ is recognized by $B^n$.

Theorem 5. A regular language is open if and only if it is recognized by a finite
direct product of loops of the form $B \circ G$, where $B$ is a group-free loop and $G$
is a group.

Proof. In [9], it is proved that a regular language $L$ is open if and only if it is the
finite union of languages that can be recognized by a loop of the form $P \circ G$ where
$G$ is a group and $P$ any loop that can recognize the language $A^* a_1 A^* \cdots A^* a_n A^*$
for a large enough $n$. The theorem follows from Corollary 2.

Abstract. We prove a lower bound of $\frac{5}{2} n^2 - 3n$ for the multiplicative
complexity of $n \times n$-matrix multiplication over arbitrary fields. More
generally, we show that for any finite dimensional semisimple algebra $A$
with unity, the multiplicative complexity of the multiplication in $A$ is
bounded from below by $\frac{5}{2} \dim A - 3(n_1 + \cdots + n_t)$ if the decomposition
of $A \cong A_1 \times \cdots \times A_t$ into simple algebras $A_\tau \cong D_\tau^{n_\tau \times n_\tau}$ contains only
noncommutative factors, that is, the division algebra $D_\tau$ is noncommutative
or $n_\tau \geq 2$.



1 Introduction 

One of the leading problems in algebraic complexity theory is the determination
of good (lower as well as upper) bounds for the multiplicative complexity of
$n \times n$-matrix multiplication. Loosely speaking, the problem is the following: given
$n \times n$-matrices $X = (X_{ij})$ and $Y = (Y_{ij})$ with indeterminates $X_{ij}$ and $Y_{ij}$
over some ground field $k$, how many essential multiplications and divisions are
needed to compute the entries of the product $XY$? Here, “essential” means that
additions, subtractions, and scalar multiplications are free of cost. According to
Strassen [20], we may reformulate the problem over infinite fields as follows: the
multiplicative complexity of $n \times n$-matrix multiplication is the smallest number $\ell$
of products

\[ p_\lambda = u_\lambda(X_{ij}, Y_{ij}) \cdot v_\lambda(X_{ij}, Y_{ij}) \]

with linear forms $u_\lambda$ and $v_\lambda$ in the $X_{ij}$ and $Y_{ij}$ such that each entry of $XY$ is
contained in the linear span of $p_1, \ldots, p_\ell$, i.e.,

\[ \sum_{\nu=1}^{n} X_{i,\nu} Y_{\nu,j} \in \mathrm{lin}\{p_1, \ldots, p_\ell\} \quad \text{for } 1 \leq i, j \leq n. \]

In other words, we may restrict our attention to computations that contain only
“normalized” multiplications and no divisions. (Since we are considering lower
bounds in this work, the above restriction to infinite fields does not impose any
problems: lower bounds over a field $K$ also hold over any subfield $k \subseteq K$.) For
a modern and comprehensive introduction to algebraic complexity theory, we
refer to [9].

A related quantity is the bilinear complexity (or rank) of $n \times n$-matrix
multiplication. Here the products $p_\lambda = u_\lambda(X_{ij}) \cdot v_\lambda(Y_{ij})$ are bilinear products, that
is, products of linear forms $u_\lambda$ in the $X_{ij}$ and linear forms $v_\lambda$ in the $Y_{ij}$. (Note
that the entries of $XY$ are bilinear forms.) The concept of bilinear complexity
has been utilized with great success in the design of asymptotically fast matrix 
multiplication algorithms, see for example [19,2,18,22,10]. Obviously, the multi- 
plicative complexity is a lower bound for the bilinear complexity and it is not 
hard to see that twice the multiplicative complexity is an upper bound for the 
bilinear complexity (see e.g. [9, Eq. 14.8]). Therefore, we usually want to have 
upper bounds for the bilinear complexity and lower bounds for the multiplicative 
complexity. While the difference between multiplicative and bilinear complexity 
seems to be minor at a first glance, it is much harder to cope with the multi- 
plicative complexity when dealing with lower bounds. One reason among others 
is the fact that the bilinear complexity of a tensor of a bilinear map (see be- 
low for a definition) is invariant under permutations whereas the multiplicative 
complexity might not, see also [9, Chap. 14.2] for a further discussion. 

1.1 Model of Computation 

Of course, we can define multiplicative complexity not only for the multiplication
of $n \times n$-matrices (which is a bilinear map $k^{n \times n} \times k^{n \times n} \to k^{n \times n}$) but also
for arbitrary bilinear maps. When considering lower bounds, it is often more
convenient to use a coordinate-free definition of multiplicative complexity, see
e.g. [9, Chap. 14.1]. In the following, if $V$ is a vector space, let $V^*$ denote the
dual space of $V$, i.e., the vector space of all linear forms on $V$.

Definition 1. Let $k$ be a field, $U$, $V$, and $W$ finite dimensional vector spaces
over $k$, and $\phi : U \times V \to W$ a bilinear map.

1. A sequence $\beta = (f_1, g_1, w_1, \ldots, f_\ell, g_\ell, w_\ell)$ with $f_\lambda, g_\lambda \in (U \times V)^*$ and $w_\lambda \in W$
is called a quadratic computation for $\phi$ over $k$ of length $\ell$ if

\[ \phi(u, v) = \sum_{\lambda=1}^{\ell} f_\lambda(u, v)\, g_\lambda(u, v)\, w_\lambda \quad \text{for all } u \in U,\ v \in V. \]

2. The length of a shortest quadratic computation for $\phi$ is called the multiplicative
complexity of $\phi$ and is denoted by $C(\phi)$.

3. If $A$ is a finite dimensional associative $k$-algebra with unity, then the
multiplicative complexity of $A$ is defined as the multiplicative complexity of the
multiplication map of $A$, which is a bilinear map $A \times A \to A$, and is denoted
by $C(A)$.

If we want to emphasize the underlying ground field $k$, we will sometimes write
$C_k(\phi)$ and $C_k(A)$ instead of $C(\phi)$ and $C(A)$, respectively. Using the language
from above, the multiplicative complexity of $n \times n$-matrix multiplication is
denoted by $C(k^{n \times n})$.

If we require that $f_\lambda \in U^*$ and $g_\lambda \in V^*$ in the above Definition 1, we
get bilinear computations and bilinear complexity (also called rank). We denote
the bilinear complexity of a bilinear map $\phi$ by $R(\phi)$ or $R_k(\phi)$ and the bilinear
complexity of an associative algebra $A$ by $R(A)$ or $R_k(A)$. For any bilinear map $\phi$,
we have

\[ C(\phi) \leq R(\phi) \leq 2 \cdot C(\phi). \tag{1} \]

Except for trivial cases, the second inequality is always strict, see [14].
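
As a concrete instance of a bilinear computation, the sketch below checks Strassen's classical seven-product scheme for $2 \times 2$ matrices. The scheme is well known but is not spelled out in the text, so it is brought in here purely as an illustration. Each product `m1`, ..., `m7` multiplies a linear form in the entries of the first matrix by a linear form in the entries of the second, and every entry of the result lies in the linear span of the seven products. The function names are hypothetical.

```python
import random


def strassen_2x2(a, b):
    """Multiply two 2x2 matrices with seven bilinear products (Strassen's scheme)."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    # Each entry of the product is a linear combination of m1, ..., m7.
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


def naive_2x2(a, b):
    """Reference product using the eight obvious multiplications."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


if __name__ == "__main__":
    rng = random.Random(0)
    a = [[rng.randint(-9, 9) for _ in range(2)] for _ in range(2)]
    b = [[rng.randint(-9, 9) for _ in range(2)] for _ in range(2)]
    print(strassen_2x2(a, b) == naive_2x2(a, b))
```

A bilinear computation of length 7 shows that the bilinear complexity of $2 \times 2$-matrix multiplication is at most 7.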



1.2 Previous Results 

In 1978, Brockett and Dobkin [7] proved the bound $R(k^{n \times n}) \geq 2n^2 - 1$ and in the
same year, Lafon and Winograd [15] extended this result to the multiplicative
complexity. Three years later, Alder and Strassen [1] unified most of the lower
bounds known at that time, including the last one, in a single theorem: for any
finite dimensional associative $k$-algebra $A$

\[ C(A) \geq 2 \dim A - t, \tag{2} \]

where $t$ is the number of maximal twosided ideals in $A$. Recently, some progress
has been made in the case of matrix multiplication, namely $C(k^{n \times n}) \geq 2n^2 + n - 3$
(see [4]). Bshouty [8] obtained the bound $R_{GF(2)}(GF(2)^{n \times n}) \geq \frac{5}{2} n^2 - o(n^2)$ for
the special case $k = GF(2)$ using methods from coding theory. He claims that
this bound also holds for the multiplicative complexity (over $GF(2)$) but does
not give a proof. Finally, in [3] we proved the lower bound

\[ R(k^{n \times n}) \geq \tfrac{5}{2} n^2 - 3n \tag{3} \]

for arbitrary fields $k$.

1.3 New Results 

As our first main result, we show that (3) also holds for the multiplicative com- 
plexity. The following theorem is proven in Section 4. 

Theorem 1. For any field $k$, $C(k^{n \times n}) \geq \frac{5}{2} n^2 - 3n$.

One of the main ingredients of its proof is Lemma 5 which is essentially a novel
combination of a result by Ja’Ja’ [14] on the relation between multiplicative and
bilinear complexity, Strassen's 3-slice tensor technique [21], and the substitution
method [16]. We prove Lemma 5 in Section 3. Before doing so, we introduce the 
so-called “tensorial notion” in the next Section 2 and present a (well-known) 
alternative characterization of multiplicative complexity, which is more suited 
for our purposes. 

The bound of Theorem 1 is a special case of the following lower bound (which 
we show in Section 5). 







Markus Blaser 



Theorem 2. Let $A \cong A_1 \times \cdots \times A_t$ be a semisimple algebra over an arbitrary
field $k$ with $A_\tau \cong D_\tau^{n_\tau \times n_\tau}$ for all $\tau$, where $D_\tau$ is a $k$-division algebra. Assume
that each factor $A_\tau$ is noncommutative, that is, $n_\tau \geq 2$ or $D_\tau$ is noncommutative.
Moreover, let $n = n_1 + \cdots + n_t$. Then $C(A) \geq \frac{5}{2} \dim A - 3n$.

Our new bound of Theorem 2 is the first lower bound over arbitrary fields for
the multiplicative complexity of a semisimple algebra (in particular of the
important algebra $k^{n \times n}$) that lies significantly above the Alder-Strassen bound (2). In [5]

(see also the forthcoming [6]), we obtained the same bound for the easier case of 
the bilinear complexity. While in this case, our bounds can also be extended to 
arbitrary finite dimensional algebras A provided that the (semisimple) quotient 
algebra A/ rad A satisfies the assumptions of Theorem 2, we provide an example 
in Section 6 which shows that our methods cannot yield this generalization for 
the multiplicative complexity (at least without any extra considerations). This 
again gives evidence for the intricate nature of multiplicative complexity. 

2 Characterizations of Multiplicative Complexity 

In the previous section, we have introduced the multiplicative complexity of a 
bilinear map in terms of computations. A second useful characterization of mul- 
tiplicative complexity is the so-called “tensorial notion” (see [9, Chap. 14.4] for 
the bilinear complexity). With a bilinear map $\phi : U \times V \to W$, we may associate
a coordinate tensor (or tensor for short) which is basically a “three-dimensional
matrix”: we fix bases $u_1, \ldots, u_m$ of $U$, $v_1, \ldots, v_n$ of $V$, and $w_1, \ldots, w_p$ of $W$.
There are unique scalars $t_{\mu,\nu,\rho} \in k$ such that

\[ \phi(u_\mu, v_\nu) = \sum_{\rho=1}^{p} t_{\mu,\nu,\rho}\, w_\rho \quad \text{for all } 1 \leq \mu \leq m,\ 1 \leq \nu \leq n. \tag{4} \]

Then $t = (t_{\mu,\nu,\rho}) \in k^{m \times n \times p}$ is the tensor of $\phi$ (with respect to the chosen bases).

On the other hand, any given tensor also defines a bilinear map after choosing
bases. We define the multiplicative complexity of the tensor $t$ by $C(t) := C(\phi)$.
In the same way, the bilinear complexity of $t$ is $R(t) := R(\phi)$. (This is in both
cases well-defined, since the multiplicative resp. bilinear complexity is robust
with respect to invertible linear transformations, i.e., with respect to changes of
bases.)

If $\phi$ is the multiplication in an algebra $A$, then we may instantiate the above
three bases with one and the same basis. In this case, the tensor consists of the
structural constants (see [11] for a definition) of the algebra $A$ (with respect to
the chosen basis).

With each tensor $t = (t_{\mu,\nu,\rho})$, we may associate three sets of matrices, the
slices of $t$. The matrices $Q_\mu = (t_{\mu,\nu,\rho})_{1 \leq \nu \leq n, 1 \leq \rho \leq p} \in k^{n \times p}$ with $1 \leq \mu \leq m$
are called the 1-slices of $t$, the matrices $S_\nu = (t_{\mu,\nu,\rho})_{1 \leq \mu \leq m, 1 \leq \rho \leq p} \in k^{m \times p}$ with
$1 \leq \nu \leq n$ the 2-slices, and finally $T_\rho = (t_{\mu,\nu,\rho})_{1 \leq \mu \leq m, 1 \leq \nu \leq n} \in k^{m \times n}$ with
$1 \leq \rho \leq p$ are called the 3-slices of $t$. When dealing with bilinear complexity, it
makes no difference which of the three sets of slices we consider. In the case of
multiplicative complexity, however, the 3-slices play a distinguished role. (This is
one reason why proving lower bounds for the multiplicative complexity is hard.)

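Lemma 1. Let $k$ be a field and $t$ be a tensor with 3-slices $T_1, \ldots, T_p \in k^{m \times n}$.
Then $C(t) \leq \ell$ if and only if there are (column) vectors $u_\lambda, v_\lambda \in k^{m+n}$ for
$1 \leq \lambda \leq \ell$ such that with $P_\lambda := u_\lambda \cdot v_\lambda^\top \in k^{(m+n) \times (m+n)}$,

\[ \begin{pmatrix} 0 & T_1 \\ T_1^\top & 0 \end{pmatrix}, \ldots, \begin{pmatrix} 0 & T_p \\ T_p^\top & 0 \end{pmatrix} \in \mathrm{lin}\{P_1 + P_1^\top, \ldots, P_\ell + P_\ell^\top\}. \tag{5} \]

Here, $T^\top$ denotes the transpose of a matrix $T$ and $\mathrm{lin}\{\ldots\}$ denotes the linear
span. A proof of this lemma is straightforward. (One possibility is to follow the
lines of the proof of Theorem 3.2 in [14].) The rank one matrices $P_\lambda$ correspond
to the products of a quadratic computation. By transposing, we identify the
product $xy$ of two indeterminates with $yx$.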

If $T_1, \ldots, T_p$ are the 3-slices of a tensor $t$, we will occasionally also write
$C(T_1, \ldots, T_p)$ instead of $C(t)$ and $R(T_1, \ldots, T_p)$ instead of $R(t)$. By multiplying
(5) with

\[ \begin{pmatrix} X & 0 \\ 0 & Y^\top \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} X^\top & 0 \\ 0 & Y \end{pmatrix} \]

from the left and right, respectively, it follows from Lemma 1 that if $X \in k^{m \times m}$
and $Y \in k^{n \times n}$ are invertible matrices, then

\[ C(T_1, \ldots, T_p) = C(X \cdot T_1 \cdot Y, \ldots, X \cdot T_p \cdot Y). \tag{6} \]

This multiplication of the slices with $X$ and $Y$ corresponds to a change of the
bases $u_1, \ldots, u_m$ and $v_1, \ldots, v_n$ in (4).



3 Lower Bounds 

In the present section, we prove our main lemma (Lemma 5). Its proof is done by 
a combination of the so-called substitution method (first used for proving lower 
bounds in algebraic complexity theory by Pan [16]), Strassen’s lower bound for 
the (border) rank of a 3-slice tensor [21], and a result by Ja’Ja’ [14] which relates 
multiplicative and bilinear complexity in a more sophisticated way than (1). 

We first state the results of Strassen and Ja’Ja’. For two elements $b, c$ of an
associative algebra, let $[b, c] := bc - cb$ denote their Lie product (or commutator).
For a matrix $M \in k^{N \times N}$, let $\mathrm{rk}\, M$ denote its (usual) rank. We denote the identity
matrix of $k^{N \times N}$ by $I_N$.

Lemma 2 (Strassen). Let $t$ be a tensor with 3-slices $I_N, B, C \in k^{N \times N}$ over
some field $k$. Then $R(t) \geq N + \frac{1}{2} \mathrm{rk}[B, C]$.

For a proof, see [21, Thm. 4.1] or [9, Thm. 19.12]. Strassen actually proves
the above lemma for the so-called border rank of $t$ (see e.g. [9, Chap. 15.4] for a
definition) which is a lower bound for the bilinear complexity.

Lemma 3 (Ja’Ja’). Let $t$ be a tensor with 3-slices $T_1, \ldots, T_p \in k^{N \times N}$ over
some field $k$. Then

\[ C(t) \geq \frac{1}{2} R\left( \begin{pmatrix} T_1 & 0 \\ 0 & T_1^\top \end{pmatrix}, \ldots, \begin{pmatrix} T_p & 0 \\ 0 & T_p^\top \end{pmatrix} \right). \]

For a proof, see [14, Thm. 3.4].

Combining these two lemmata, we obtain a lower bound for the multiplicative 
complexity of a 3-slice tensor. 

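Lemma 4. Let $k$ be a field. Let $t$ be a tensor with 3-slices $I_N, B, C \in k^{N \times N}$.
Then $C(t) \geq N + \frac{1}{4} \mathrm{rk}[B, C] + \frac{1}{4} \mathrm{rk}[B, C] = N + \frac{1}{2} \mathrm{rk}[B, C]$.

Proof. By the previous Lemmata 3 and 2

\[ C(t) \geq \frac{1}{2} R\left( I_{2N}, \begin{pmatrix} B & 0 \\ 0 & B^\top \end{pmatrix}, \begin{pmatrix} C & 0 \\ 0 & C^\top \end{pmatrix} \right) \geq \frac{1}{2} \left( 2N + \frac{1}{2} \mathrm{rk} \left[ \begin{pmatrix} B & 0 \\ 0 & B^\top \end{pmatrix}, \begin{pmatrix} C & 0 \\ 0 & C^\top \end{pmatrix} \right] \right). \tag{7} \]

Since $[B^\top, C^\top] = -[B, C]^\top$,

\[ \mathrm{rk} \left[ \begin{pmatrix} B & 0 \\ 0 & B^\top \end{pmatrix}, \begin{pmatrix} C & 0 \\ 0 & C^\top \end{pmatrix} \right] = \mathrm{rk} \begin{pmatrix} BC - CB & 0 \\ 0 & -(BC - CB)^\top \end{pmatrix} = 2\, \mathrm{rk}[B, C]. \]

Thus, the right-hand side of (7) equals $N + \frac{1}{2} \mathrm{rk}[B, C]$. □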

We now come to the proof of our main lemma. We combine the substitu- 
tion method (here manifesting itself in the Steinitz exchange) with the previous 
Lemma 4. 



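Lemma 5. Let $t$ be a tensor with linearly independent 3-slices $T_1, \ldots, T_p \in
k^{N \times N}$ over some field $k$. Assume there are integers $s$ and $q$ such that for each
basis $U_1, \ldots, U_p$ of $\mathrm{lin}\{T_1, \ldots, T_p\}$ there are indices $i_1, \ldots, i_s$ and $j_1, \ldots, j_q$ with
the following properties: the linear span of $U_{i_1}, \ldots, U_{i_s}$ contains an invertible
matrix $E$ and the linear span of $U_{j_1}, \ldots, U_{j_q}$ contains matrices $B$ and $C$ with
$\mathrm{rk}[B, C] = N$. Then $C(t) \geq p - s - q + \frac{3}{2} N$.

Proof. Let $\ell := C(t)$. By Lemma 1, there are $2N \times 2N$-matrices $P_1, \ldots, P_\ell$ of
rank one and $S_\rho \in \mathrm{lin}\{P_1, \ldots, P_\ell\}$ such that

\[ \begin{pmatrix} 0 & T_\rho \\ T_\rho^\top & 0 \end{pmatrix} = S_\rho + S_\rho^\top \quad \text{for } 1 \leq \rho \leq p. \tag{8} \]

The matrices $S_1, \ldots, S_p$ are linearly independent, since otherwise we would
obtain a linear dependence of the 3-slices $T_1, \ldots, T_p$ from (8), a contradiction.
We now exploit the Steinitz exchange in a rather explicit way: write $S_\rho =
\sum_{\lambda=1}^{\ell} \xi_{\rho,\lambda} P_\lambda$ with scalars $\xi_{\rho,\lambda} \in k$ for $1 \leq \rho \leq p$. Let $X = (\xi_{\rho,\lambda}) \in k^{p \times \ell}$.
The matrix $X$ has full rank $p$, since the $S_\rho$ are linearly independent. By
permuting the $P_\lambda$, we may assume that the matrix $X'$ consisting of the first $p$
columns of $X$ is invertible. Let $Y'$ be its inverse. We may augment $Y'$ to a
matrix $Y = (\eta_{\rho,\lambda}) \in k^{p \times \ell}$ such that

\[ \sum_{i=1}^{p} \eta_{\rho,i} S_i = P_\rho + \sum_{\lambda=p+1}^{\ell} \eta_{\rho,\lambda} P_\lambda =: M_\rho \quad \text{for } 1 \leq \rho \leq p. \tag{9} \]

The above defined matrices $M_1, \ldots, M_p$ are linearly independent and their linear
span equals $\mathrm{lin}\{S_1, \ldots, S_p\}$. By virtue of (8),

\[ \begin{pmatrix} 0 & T_\rho \\ T_\rho^\top & 0 \end{pmatrix} \in \mathrm{lin}\{M_1 + M_1^\top, \ldots, M_p + M_p^\top\} \quad \text{for } 1 \leq \rho \leq p. \tag{10} \]

Let $L_\rho$ be the $N \times N$-matrix defined by

\[ \begin{pmatrix} 0 & L_\rho \\ L_\rho^\top & 0 \end{pmatrix} = M_\rho + M_\rho^\top. \]

($M_\rho + M_\rho^\top$ is of the above form by (8) and the linear independence of $T_1, \ldots, T_p$.)
By (10), $\mathrm{lin}\{L_1, \ldots, L_p\}$ equals $\mathrm{lin}\{T_1, \ldots, T_p\}$. By the assumption of the lemma,
there are indices $i_1, \ldots, i_s$ such that $\mathrm{lin}\{L_{i_1}, \ldots, L_{i_s}\}$ contains an invertible
matrix $E$. Exploiting (6), we may replace $L_\rho$ with $E^{-1} L_\rho$ for $1 \leq \rho \leq p$. Thereafter,
$I_N \in \mathrm{lin}\{L_{i_1}, \ldots, L_{i_s}\}$. Again by assumption, there are indices $j_1, \ldots, j_q$ such
that $\mathrm{lin}\{L_{j_1}, \ldots, L_{j_q}\}$ contains matrices $B$ and $C$ with $\mathrm{rk}[B, C] = N$.

By (9), it follows that

\[ \begin{pmatrix} 0 & I_N \\ I_N & 0 \end{pmatrix}, \begin{pmatrix} 0 & B \\ B^\top & 0 \end{pmatrix}, \begin{pmatrix} 0 & C \\ C^\top & 0 \end{pmatrix} \in \mathrm{lin}\{P_{i_1} + P_{i_1}^\top, \ldots, P_{i_s} + P_{i_s}^\top, P_{j_1} + P_{j_1}^\top, \ldots, P_{j_q} + P_{j_q}^\top, P_{p+1} + P_{p+1}^\top, \ldots, P_\ell + P_\ell^\top\}. \]

Thus, $C(I_N, B, C) \leq \ell - p + s + q$ yielding $C(t) \geq p - s - q + C(I_N, B, C)$. By
Lemma 4, $C(I_N, B, C) \geq \frac{3}{2} N$. □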



4 Matrix Multiplication 

As the first and most important example, we apply Lemma 5 to the algebra $k^{n \times n}$.
Our aim is to utilize the following two lemmata which are proven in [3, Sect. 4].

Lemma 6. Let k be an infinite field and let V be a subspace of $k^{n \times n}$ that
contains an invertible matrix. Then for any basis $v_1, \dots, v_d$ of V there are $s \le n$
and indices $i_1, \dots, i_s$ such that already the linear span of $v_{i_1}, \dots, v_{i_s}$ contains an
invertible matrix.

Lemma 7. Let k be an infinite field, let $n \ge 2$, and let $v_1, \dots, v_p$ be a basis
of $k^{n \times n}$, where $p = n^2$. Then there are $q \le 2n$, indices $j_1, \dots, j_q$, and
$b, c \in \operatorname{lin}\{v_{j_1}, \dots, v_{j_q}\}$ such that $[b, c]$ is invertible.

To exploit the above two lemmata, we have to relate the structure of the
algebra $k^{n \times n}$ to the structure of the 3-slices of the coordinate tensor of $k^{n \times n}$.

For the moment, consider an arbitrary associative algebra A of dimension p.
For an element $x \in A$, let $\ell_x$ and $r_x$ denote the vector space endomorphisms
defined by $y \mapsto xy$ and $y \mapsto yx$ for all $y \in A$, respectively. Let $a_1, \dots, a_p$ be a
basis of A and t be the corresponding coordinate tensor. From (4) (where each
of the three bases is instantiated with $a_1, \dots, a_p$) it is clear that the $\rho$th 1-slice
of t is the matrix of the left multiplication $\ell_{a_\rho}$ (with respect to $a_1, \dots, a_p$). Thus,
the homomorphism that maps $a_\rho$ to the $\rho$th 1-slice of t for each $\rho$ is a faithful
representation of A. Therefore, the subalgebra of $k^{p \times p}$ generated by the 1-slices
of t is isomorphic to A. In the same way, the subalgebra of $k^{p \times p}$ generated by the
2-slices is isomorphic to the opposite algebra $A^{o}$ of A (see [11] for a definition).
How the 3-slices of t are related to the structure of A is a more subtle question.

In the case of the algebra $k^{n \times n}$ we are lucky. It is well known that the
structure of the coordinate tensor t of $k^{n \times n}$ is invariant under permutations, see
for example [9, Eq. 14.21]. It follows that if $Q_1, \dots, Q_p$ denote the 1-slices of t
and $T_1, \dots, T_p$ denote the 3-slices of t (where $p = n^2$), then there are invertible
matrices $X \in k^{p \times p}$ and $Y \in k^{p \times p}$ (even permutation matrices) such that

$$\operatorname{lin}\{Q_1, \dots, Q_p\} = \operatorname{lin}\{X \cdot T_1 \cdot Y, \dots, X \cdot T_p \cdot Y\}.$$

By (6), we may replace $T_\rho$ by $X \cdot T_\rho \cdot Y$ for $1 \le \rho \le p$. Thereafter, the subalgebra
of $k^{p \times p}$ generated by $T_1, \dots, T_p$ is isomorphic to the algebra $k^{n \times n}$. By the above
Lemmata 6 and 7, $T_1, \dots, T_p$ fulfill the assumptions of Lemma 5 with $s = n$
and $q = 2n$. (The restriction that k is infinite imposes no problem here, since
$C_k(k^{n \times n}) \ge C_K(K^{n \times n})$ for any extension field $K \supset k$.) Now Lemma 5 yields
$C(T_1, \dots, T_p) \ge n^2 - n - 2n + \frac{3}{2}n^2$. This completes the proof of Theorem 1.
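
The arithmetic behind the final step can be spelled out as follows. This is only a worked restatement, assuming that the constant in Lemma 5 reads $\frac{3}{2}$ and that here $p = N = n^2$, $s = n$ and $q = 2n$, as in the text above.

```latex
\begin{align*}
C(T_1, \dots, T_p) &\ge p - s - q + \tfrac{3}{2} N \\
                   &= n^2 - n - 2n + \tfrac{3}{2} n^2 \\
                   &= \tfrac{5}{2} n^2 - 3n .
\end{align*}
```

For instance, under these assumptions $n = 10$ gives a lower bound of $250 - 30 = 220$ essential multiplications.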

5 Further Applications: Semisimple Algebras 

In the present section, we generalize the results of the preceding section to 
semisimple algebras. 

As seen above, we have to relate the 3-slices of the coordinate tensor of an
algebra with the structure of that algebra. We start with the case of a simple
k-algebra A. By Wedderburn's Structure Theorem (see [11]), A is isomorphic to
an algebra $D^{n \times n} \cong D \otimes k^{n \times n}$ for some positive integer n and some k-division
algebra D. Let $p := \dim A = n^2 \cdot \dim D$. Let $a_1, \dots, a_p$ be a basis of A and let
$a_1^*, \dots, a_p^*$ denote its dual basis. Let $\ell_{a_\mu}^*$ denote the dual of the left multiplication
with $a_\mu$, that is, the linear map $A^* \to A^*$ defined by $b \mapsto b \circ \ell_{a_\mu}$. Applying $a_\rho^*$ to
(4) yields

$$a_\rho^*(a_\mu a_\nu) = t_{\mu,\nu,\rho},$$

hence

$$\ell_{a_\mu}^*(a_\rho^*) = \sum_{\nu=1}^{p} t_{\mu,\nu,\rho}\, a_\nu^* \quad \text{for all } 1 \le \mu, \rho \le p. \qquad (11)$$



De Groote [12, Prop. 1.1] shows that for any simple algebra A there is a vector
space isomorphism $S : A \to A^*$ fulfilling

$$\ell_x^* = S \circ r_x \circ S^{-1} \quad \text{for all } x \in A.$$



Substituting this into (11), we obtain

$$S^{-1}(a_\rho^*) \cdot a_\mu = \sum_{\nu=1}^{p} t_{\mu,\nu,\rho}\, S^{-1}(a_\nu^*) \quad \text{for all } 1 \le \mu, \rho \le p.$$

Thus, the $\rho$th 3-slice of the tensor of A with respect to the basis $a_1, \dots, a_p$ is
the matrix of the homomorphism $\ell_{S^{-1}(a_\rho^*)} : A \to A$ with respect to the two
bases $a_1, \dots, a_p$ and $S^{-1}(a_1^*), \dots, S^{-1}(a_p^*)$. Hence, there are invertible
matrices $X, Y \in k^{p \times p}$ such that the subalgebra of $k^{p \times p}$ generated by the matrices
$X \cdot T_1 \cdot Y, \dots, X \cdot T_p \cdot Y$ is isomorphic to A.

Next, we consider the case of a semisimple algebra. By Wedderburn's Structure
Theorem, any semisimple algebra A is isomorphic to a direct product of
simple algebras $A_1 \times \dots \times A_t$ where each $A_\tau = D_\tau^{n_\tau \times n_\tau}$ for some k-division
algebra $D_\tau$. If we choose a basis with respect to this decomposition of A, then the
corresponding coordinate tensor is a direct sum of the tensors of $A_1, \dots, A_t$. By
applying the above considerations for the simple case separately to each $A_\tau$, we
conclude that there are invertible matrices $X, Y \in k^{p \times p}$ such that the subalgebra
of $k^{p \times p}$ spanned by $X \cdot T_1 \cdot Y, \dots, X \cdot T_p \cdot Y$ is isomorphic to A.

The following analogue of Lemma 6 and Lemma 7 is proven in [5] : 



Lemma 8. Let $A = A_1 \times \dots \times A_t$ be a semisimple algebra over an infinite field k
with $A_\tau = D_\tau^{n_\tau \times n_\tau}$ for all $\tau$, where $D_\tau$ is a k-division algebra. Assume that
each factor $A_\tau$ is noncommutative, that is, $n_\tau \ge 2$ or $D_\tau$ is noncommutative.
Moreover, let $n = n_1 + \dots + n_t$ and $v_1, \dots, v_p$ be a basis of A.

1. There are $s \le n$ and indices $i_1, \dots, i_s$ such that $\operatorname{lin}\{v_{i_1}, \dots, v_{i_s}\}$ contains an
invertible element.

2. There are $q \le 2n$, indices $j_1, \dots, j_q$, and $b, c \in \operatorname{lin}\{v_{j_1}, \dots, v_{j_q}\}$ such that
$[b, c]$ is invertible.

By the above Lemma 8, the 3-slices $T_1, \dots, T_p$ fulfill the assumptions of Lemma 5
with $s = n$ and $q = 2n$ where $n = n_1 + \dots + n_t$. (Again, the restriction that k is
infinite does not impose any problems here: we can switch over from k to $k(x)$
with some extra indeterminate x. This does not have any relevant impact on the
structure of A.) Altogether, this proves Theorem 2.

6 A Limiting Example

For the bilinear complexity, our bounds can also be extended to arbitrary
finite dimensional algebras A, provided that the (semisimple) quotient algebra
A / rad A fulfills the assumptions of Theorem 2 (see [5] and the forthcoming [6]).
Here we construct an example that satisfies these assumptions but for which our
method fails in the case of multiplicative complexity. Of course, this does not
mean that our method cannot be applied to arbitrary associative algebras; we
just have to examine the 3-slices of the algebra explicitly.

Let $X_1, \dots, X_n$ be indeterminates over some field k. Furthermore, let I denote
the ideal generated by all monomials of total degree two. Consider the algebra
$A = k[X_1, \dots, X_n]/I$. We have $X_i \cdot X_j = 0$ in A for all i, j. With respect to the
basis $1, X_1, \dots, X_n$, the tensor $t_A$ of A looks as follows:

$$t_A = \begin{pmatrix} 1 & 2 & 3 & \cdots & n+1 \\ 2 & & & & \\ 3 & & & & \\ \vdots & & & & \\ n+1 & & & & \end{pmatrix}$$

Above, a $\rho$ in position $(\mu, \nu)$ means that the $\rho$th 3-slice has the entry one in
position $(\mu, \nu)$. Unspecified entries are zero.

The algebra A is commutative, so A / rad A does not fulfill the assumptions
of Theorem 2. (In fact, A is of minimal rank.) Instead, consider $A' = D \otimes A$
for some noncommutative central division algebra D. We have A' / rad A' = D,
thus A' / rad A' satisfies the assumptions of Theorem 2. However, any matrix
in the linear span of the 3-slices of the tensor of A' (which we obtain from $t_A$
by substituting each one in $t_A$ by the tensor of D) has rank at most $2 \dim D$.
Consequently, the Lie product of any two such matrices has rank at most $4 \dim D$.
Therefore, if n is large, we are not able to obtain the additional $\frac{1}{2} \dim A'$ that
we achieved in the bound of Theorem 2.

Abstract. We prove new results on evasiveness of monotone graph
properties by extending the techniques of Kahn, Saks and Sturtevant [4].
For the property of containing a subgraph isomorphic to a fixed graph,
and a fairly large class of related n-vertex graph properties, we show
evasiveness for an arithmetic progression of values of n. This implies
a $\frac{1}{2}n^2 - O(n)$ lower bound on the decision tree complexity of these
properties.

We prove that properties that are preserved under taking graph minors 
are evasive for all sufficiently large n. This greatly generalizes the eva- 
siveness result for planarity [1]. We prove a similar result for bipartite 
subgraph containment. 

Keywords: Decision Tree Complexity, Monotone Graph Properties, 
Evasiveness, Graph Property Testing. 



1 Introduction 

Suppose we have an input graph G and are required to decide whether or not 
it has a certain (isomorphism invariant) property P. The graph is given by an 
oracle which answers queries of the form “is (x, y) an edge of G?” A decision 
tree algorithm for P is a strategy that specifies a sequence of such queries to the 
oracle, where each query may depend upon the outcomes of the previous ones, 
terminating when sufficient information about G has been obtained to decide 
whether or not P holds for G. The cost of such a decision tree algorithm is the 
worst case number of queries that it makes. The decision tree complexity of P is 
the minimum cost of any decision tree algorithm for P. 

Since an n-vertex graph has $\frac{1}{2}n(n-1)$ vertex pairs, each of which could either
be an edge or not, it is clear that any property of n-vertex graphs has complexity
at most $\frac{1}{2}n(n-1)$. If a property happens to have complexity exactly $\frac{1}{2}n(n-1)$
then it is said to be evasive.^1
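
For very small n, the definitions above can be checked by exhaustive search. The following is a minimal sketch, not part of the original argument: the function names `decision_tree_complexity` and `is_connected` are our own, and the search is exponential, so it is only meant for n up to about 4. It computes the minimum, over all query strategies, of the worst-case number of queries needed before the answer is determined.

```python
from functools import lru_cache
from itertools import combinations


def is_connected(n, edges):
    """Return True if the graph on vertices 0..n-1 with the given edges is connected."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for w in adj[u] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == n


def decision_tree_complexity(n, has_property):
    """Brute-force decision tree complexity of a property of n-vertex graphs."""
    pairs = list(combinations(range(n), 2))
    m = len(pairs)

    def possible_answers(state):
        # Property values over all graphs consistent with the answers so far.
        unknown = [i for i in range(m) if state[i] is None]
        answers = set()
        for mask in range(1 << len(unknown)):
            full = list(state)
            for bit, i in enumerate(unknown):
                full[i] = (mask >> bit) & 1
            edges = [pairs[i] for i in range(m) if full[i]]
            answers.add(bool(has_property(n, edges)))
        return answers

    @lru_cache(maxsize=None)
    def cost(state):
        # No further query is needed once the answer is forced.
        if len(possible_answers(state)) == 1:
            return 0
        # Otherwise pick the best next query against the worst oracle answer.
        return min(
            1 + max(cost(state[:i] + (a,) + state[i + 1:]) for a in (0, 1))
            for i in range(m) if state[i] is None
        )

    return cost((None,) * m)


if __name__ == "__main__":
    for n in (3, 4):
        print(n, decision_tree_complexity(n, is_connected), n * (n - 1) // 2)
```

For connectedness on 3 and 4 vertices the computed complexity should coincide with $\frac{1}{2}n(n-1)$, in line with the evasiveness of connectedness mentioned in the text.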

* This work was supported in part by NSF Grant CCR-96-23768, NSF Grant CCR-
98-20855 and ARO Grant DAAH04-96-1-0181.

^1 Some authors call such properties “elusive” instead of evasive.

A property of n-vertex graphs is said to be monotone if, starting with a graph
which has the property, the addition of edges does not destroy the property. It
is said to be nontrivial if there exists an n-vertex graph which has the property
and one which does not. Connectedness, non-planarity, non-k-colorability and
the property of containing a perfect matching are all examples of nontrivial
monotone properties (for sufficiently large n). Rosenberg [7] attributes to Karp
the following conjecture which, remarkably, remains open even today.

Karp Conjecture: Every nontrivial monotone graph property is evasive. 

As a first step towards a resolution of this conjecture, Rivest and Vuillemin [6]
proved that such properties have complexity at least $n^2/16$, thereby settling
the Aanderaa-Rosenberg conjecture [7] of an $\Omega(n^2)$ complexity lower bound.
The next big advance was the work of Kahn, Saks and Sturtevant [4] where
an interesting topological approach was used to prove that the Karp Conjecture
holds whenever n is a prime power. Triesch [8] used this approach, together with
a complicated construction, to prove the evasiveness of some special classes of
properties. Similar topological ideas were used by Yao [9] to prove a related
result: namely, that nontrivial monotone bipartite graph properties are always
evasive. Prior to the work of Kahn et al., adversarial strategies had been devised
to prove the evasiveness of certain specific graph properties for all n in [5], [1]
and [3]. These strategies worked for the properties of acyclicity, connectedness,
2-connectedness, planarity and simple variants on these. The most sophisticated of
these adversarial strategies was one used by Bollobás [2] to prove the evasiveness
of the property of containing a k-clique, for any k, $2 \le k \le n$.

Let H be any fixed graph. For n-vertex graphs, let $Q_H^n$ denote the property
of containing H as a subgraph (not necessarily as an induced subgraph). From
the work of Bollobás [2] we know that $Q_H^n$ is evasive for all n in the special case
when H is a complete graph. This raises the natural question: what can we say
about general H?

In this paper, we study this question and some related ones, extending, for 
the first time, the topological approach of [4] to a fairly general class of graph 
properties. For each of these properties, we draw stronger inferences than [4]. 
Our Main Theorem is stated below. 

Theorem 1.1 (Main Theorem). For any fixed graph H there exists an integer
$r_0$ with the following property. Suppose $n = \sum_{i=1}^{r} q^{\alpha_i}$, where q is a prime power,
$q \ge |H|$, each $\alpha_i \ge 1$ and $r \equiv 1 \pmod{r_0}$. Then $Q_H^n$ is evasive.

In order to understand the significance and strength of this theorem, consider 
the following statements (proven in this paper) . Each of these statements follows 
either from the Main Theorem or from the techniques used in proving it. 

— For any graph H, there is an arithmetic progression such that $Q_H^n$ is evasive
for all n in the progression. Note that this is a much stronger inference than
can be drawn by applying the results of [4].

— The decision tree complexity of $Q_H^n$ is $\frac{1}{2}n^2 - O(n)$. This bound does not
follow from the results of [4].

— If the graph H is bipartite, then $Q_H^n$ is evasive for large enough n.

— Any n-vertex nontrivial graph property that is preserved under taking graph
minors is evasive for large enough n. This includes many natural graph
properties such as embeddability on any surface, outerplanarity, linkless
embeddability in $\mathbb{R}^3$, the property of being a series-parallel graph, etc. Thus,
our result generalizes a result of Best et al. [1] who show that planarity is
evasive.

— Any monotone boolean combination of the properties $Q_H^n$ for several different
graphs H still satisfies our Main Theorem. Thus, for example, if $H_1$, $H_2$ and
$H_3$ are fixed graphs, then the property of containing as subgraph either $H_1$
or both of $H_2$ and $H_3$ is still evasive for those n which satisfy the conditions
of the Main Theorem.

The remainder of the paper is organized as follows. In Section 2 we review 
the basics of the topological approach of Kahn et al.[4], establishing a connection 
between proving evasiveness of monotone properties and computing Euler char- 
acteristics of abstract complexes. Then in Section 3 we define a certain auxiliary 
property of graphs and prove a technical result (called the Main Lemma) about 
this property. This result is then used in Section 4 to prove our main theorem. 
In Section 5, we provide proofs for the additional results itemized above. We end 
with some concluding remarks in Section 6. 

Notations, Terminology, and Conventions: We call a graph trivial if it has no 
edges. Throughout this paper, all graphs will be assumed to be nontrivial, have no 
loops and no parallel edges. For a graph G, |G| will denote the number of vertices in 
G, also called the size of G, V(G) will denote its vertex set, E(G) its edge set, chr(G)
its chromatic number and clq(G) the size of its largest clique. Graphs which occur 
as “input graphs” on which boolean properties are to be tested are assumed to be 
always vertex-labeled. All other graphs are assumed to be unlabeled, unless otherwise 
specified. When we speak of an “edge” in an input graph, we really mean an unordered 
vertex pair which may or may not be an edge. 



2 Review of the Topological Approach 

A property of m boolean variables $x_1, \dots, x_m$ is a function $P : \{0,1\}^m \to \{0,1\}$;
we say that the m-tuple $(x_1, \dots, x_m)$ has (or satisfies) property P if
$P(x_1, \dots, x_m) = 1$. We say that P is monotone if for every m-tuple $(x_1, \dots, x_m)$
that satisfies P, increasing any $x_i$ from 0 to 1 yields an m-tuple that also satisfies
P. We say that P is evasive if any decision tree algorithm for P has cost m. In
our study of graph properties, the variables will be unordered pairs of vertices
(i.e., potential edges of the graph) and P will be required to be invariant under
relabelings of the graph.

Let [m] denote the set $\{1, 2, \dots, m\}$ and consider the collection of subsets
$S \subseteq [m]$ with the following property: setting the variables indexed by S to 1
and those indexed by $[m] \setminus S$ to 0 yields an m-tuple which does not satisfy
P. Since P is monotone, this collection of sets is downward closed under set
inclusion. Recall that such a downward closed collection of sets is called an
abstract complex, and that the sets in this collection are called the faces of the
complex. This observation motivates the following definition.

Definition 2.1. If P is monotone, then the abstract complex associated with P,
denoted $\Delta(P)$, is defined as follows:

$$\Delta(P) = \{S \subseteq [m] : \text{if } x_i = 1 \iff i \in S, \text{ then } (x_1, \dots, x_m) \text{ does not satisfy } P\}.$$

Associated with an abstract complex $\Delta$ is a topologically important number
called its Euler characteristic, which is denoted $\chi(\Delta)$ and is defined as follows:

$$\chi(\Delta) = \sum_{\emptyset \ne F \in \Delta} (-1)^{|F| - 1}. \qquad (1)$$

Kahn et al. [4] showed that non-evasiveness of P has topological consequences
for $\Delta(P)$. The following theorem is implicit in their work:

Theorem 2.2 (Kahn et al. [4]). If the monotone property P is not evasive,
then $\chi(\Delta(P)) = 1$. □
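
As a small illustration of Definition 2.1 and Theorem 2.2, the sketch below builds the abstract complex of a monotone property by enumeration and evaluates its Euler characteristic, read here as the alternating sum over nonempty faces. The helper names and the two toy properties are our own choices: the first depends only on $x_1$ (so it is not evasive on two variables), the second is connectedness on 3 labeled vertices, where a graph is connected exactly when at least two of the three possible edges are present.

```python
from itertools import combinations


def abstract_complex(m, prop):
    """Faces of the complex: all S within {0..m-1} whose indicator vector fails prop."""
    faces = []
    for size in range(m + 1):
        for S in combinations(range(m), size):
            x = tuple(1 if i in S else 0 for i in range(m))
            if not prop(x):
                faces.append(frozenset(S))
    return faces


def euler_characteristic(faces):
    """Alternating sum of (-1)^(|F|-1) over the nonempty faces F."""
    return sum((-1) ** (len(F) - 1) for F in faces if F)


# Toy property on two variables that ignores the second variable.
chi_dummy = euler_characteristic(abstract_complex(2, lambda x: x[0] == 1))

# Connectedness on 3 labeled vertices: the variables are the 3 potential edges.
chi_conn3 = euler_characteristic(abstract_complex(3, lambda x: sum(x) >= 2))

if __name__ == "__main__":
    print(chi_dummy, chi_conn3)
```

The first value equals 1, as Theorem 2.2 requires of a non-evasive property; the second differs from 1, which by the contrapositive of Theorem 2.2 shows that connectedness on 3 vertices is evasive.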

For our result, we shall need to use a stronger theorem which can also be
found in [4]. Let $\Delta$ be an abstract complex defined on [m] and let $\Gamma$ be a finite
group which acts on the set [m], preserving the faces of $\Delta$. The action partitions
[m] into orbits, say $A_1, \dots, A_k$. We use the action of $\Gamma$ to define another abstract
complex $\Delta_\Gamma$ on [k] as follows:

$$\Delta_\Gamma = \Big\{S \subseteq [k] : \bigcup_{i \in S} A_i \in \Delta\Big\}. \qquad (2)$$

Sometimes, as is the case with our work, it is not easy to say much about
$\Delta(P)$ for a monotone property P. But it is possible to find some group $\Gamma$ such
that its action produces a more understandable abstract complex $(\Delta(P))_\Gamma$. The
next theorem, the most important tool in [4], says that if $\Gamma$ has certain rather
restrictive properties, then non-evasiveness of P has a topological consequence
on this new complex.

Theorem 2.3 (Kahn et al. [4]). Suppose $\Gamma$ has a normal subgroup $\Gamma_1$ which
is such that $|\Gamma_1|$ is a prime power and the quotient group $\Gamma/\Gamma_1$ is cyclic. Then
if P is not evasive, we have $\chi((\Delta(P))_\Gamma) = 1$. □
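
The construction in (2) is easy to carry out mechanically on a toy instance. The sketch below is our own illustration with hypothetical helper names: it takes the complex of connectedness on 3 labeled vertices (potential edges numbered 0, 1, 2, faces being the edge sets of disconnected graphs) and the cyclic group of order 3 rotating the vertices, under which all three edges form a single orbit. That group has prime order, so it meets the hypotheses of Theorem 2.3 with $\Gamma_1 = \Gamma$.

```python
from itertools import combinations


def quotient_complex(faces, orbits):
    """Sets S of orbit indices whose union of orbits is a face, as in (2)."""
    face_set = set(faces)
    result = []
    for size in range(len(orbits) + 1):
        for S in combinations(range(len(orbits)), size):
            union = frozenset().union(*(orbits[i] for i in S))
            if union in face_set:
                result.append(frozenset(S))
    return result


def euler_characteristic(faces):
    """Alternating sum of (-1)^(|F|-1) over the nonempty faces F."""
    return sum((-1) ** (len(F) - 1) for F in faces if F)


# Disconnected graphs on 3 labeled vertices: at most one of the 3 edges present.
faces = [frozenset(), frozenset({0}), frozenset({1}), frozenset({2})]
# Rotating the 3 vertices permutes the 3 edges cyclically: a single orbit.
orbits = [frozenset({0, 1, 2})]

quotient = quotient_complex(faces, orbits)
chi = euler_characteristic(quotient)

if __name__ == "__main__":
    print(quotient, chi)
```

Here the quotient complex consists of the empty face only, so its Euler characteristic is 0 rather than 1, and Theorem 2.3 again yields evasiveness of this small property.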

An application of this result leads to the following theorem which is the main 
result of [4]. 

Theorem 2.4 (Kahn et al. [4]). Let $P_n$ be a nontrivial monotone property of
n-vertex graphs. If n is a prime power, then $P_n$ is evasive. □

In order to derive Theorem 2.4 from Theorem 2.3, Kahn et al. construct a 
group which acts on the vertices of the input graph and thus, indirectly, on the 
edges. The number theoretic constraint on n is a consequence of the fact that 
this action depends crucially on being able to view the vertices of the graph as 
elements of a finite field. Our approach to proving evasiveness for more general 
n will be to devise a more sophisticated group action. Before we do so, we will 
need an auxiliary result which we shall establish in the next section. 

3 The Main Lemma 

Consider the following operation on a graph G. Let the vertices of G be colored,
using all the colors in some set C, so that no two adjacent vertices get the same
color. Let G' be a graph with vertex set C where two distinct vertices $c_1, c_2 \in C$
are adjacent iff the coloring assigns colors $c_1$ and $c_2$ to the end-points of some
edge in G. We shall call G' a compression of graph G induced by this coloring. If
there exists a coloring which induces a compression G' of G, we shall write $G' \lhd G$.
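
The compression operation can be made concrete with a few lines of code. The following sketch uses our own function name `compress` and represents a graph by a list of vertex pairs and a coloring by a dictionary; it returns the color set together with the edges of the compressed graph.

```python
def compress(edges, coloring):
    """Compression of a graph induced by a proper coloring.

    edges: iterable of vertex pairs; coloring: dict mapping vertex -> color.
    Returns (colors, compressed_edges) with edges given as 2-element frozensets.
    """
    for u, v in edges:
        if coloring[u] == coloring[v]:
            raise ValueError("coloring is not proper")
    colors = set(coloring.values())
    compressed = {frozenset((coloring[u], coloring[v])) for u, v in edges}
    return colors, compressed


# A 4-cycle, 2-colored: the compression is a single edge (K_2).
c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
k2 = compress(c4, {0: "a", 1: "b", 2: "a", 3: "b"})

# A 5-cycle, 3-colored: the compression is a triangle (K_3).
c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
k3 = compress(c5, {0: 0, 1: 1, 2: 0, 3: 1, 4: 2})

if __name__ == "__main__":
    print(k2)
    print(k3)
```

In particular, coloring a graph H with chr(H) colors always compresses it to a complete graph on chr(H) vertices, since every pair of color classes must be joined by some edge; this is the fact used later for the family of compressions of H.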

Definition 3.1. A family $\mathcal{F}$ of graphs is said to be closed under compression if
for graphs G, H such that $G \in \mathcal{F}$ and $H \lhd G$ we have $H \in \mathcal{F}$.

Let $\mathcal{F}$ be a nonempty finite family of (nontrivial) graphs that is closed under
compression. The property $P_{\mathcal{F}}^n$ that an input graph G on n vertices contains
some member of $\mathcal{F}$ as a subgraph is clearly nontrivial, for n large enough, and
monotone. Let $\Delta_n^{\mathcal{F}}$ be the abstract complex associated with this property and
let $\chi_n = \chi(\Delta_n^{\mathcal{F}})$ be the Euler characteristic of this complex.

The purpose of this section is to establish that for any such family $\mathcal{F}$, we
have $\chi_n \ne 1$ infinitely often. Let us set

$$T = 2^{2^t}, \text{ where } t \text{ is the smallest integer such that } T \ge \min_{F \in \mathcal{F}} |F|. \qquad (3)$$

We shall prove

Lemma 3.2 (Main Lemma). If $n \equiv 1 \pmod{T - 1}$ then $\chi_n \equiv 0 \pmod 2$.

Since we only care about $\chi_n$ mod 2, we can use the fact that addition and
subtraction are equivalent mod 2 in (1) to get^2

$$\chi_n \equiv \#\{G : G \text{ is nontrivial and does not satisfy } P_{\mathcal{F}}^n\} \pmod 2. \qquad (4)$$



Consider n-vertex input graphs with vertices labeled with integers from 0 to
n − 1. For $n \ge T$, let us define a group action on such graphs as follows. For
$a, b \in \{0, 1, 2, \dots, T - 1\}$ and a odd, let permutation $\phi_{a,b}$ be defined by mapping
vertex i to vertex $(ai + b) \bmod T$ for $i \in \{0, 1, \dots, T - 1\}$. The other n − T
vertices are left fixed. It is routine to check that the set of all these permutations
forms a group $\Phi$ under composition, thereby defining a group action on the
labeled vertices. This action induces an action on graphs in the obvious manner,
thereby partitioning the set of all labeled n-vertex graphs into orbits. Since the
order $|\Phi|$ of the group is $T^2/2$, a power of 2, each orbit has size a power of 2.
Orbits of size greater than one therefore contribute an even number to the count
in (4), and only the invariant graphs remain. Hence (4) can be modified to

$$\chi_n \equiv \#\{G : G \text{ is nontrivial, invariant under } \Phi \text{ and does not satisfy } P_{\mathcal{F}}^n\} \pmod 2. \qquad (5)$$



The action of $\Phi$ on the vertices also induces an action on edges (or rather, on
unordered pairs of distinct vertices, each of which may or may not be an edge),
not to be confused with the action on labeled graphs mentioned above. Therefore
the set of edges amongst vertices 0, 1, ..., T − 1 is partitioned into orbits. Since
any odd integer is invertible mod T, we get $2^t$ orbits $E_0, E_1, \dots, E_{2^t - 1}$, where

$$E_i = \{(x, y) : 0 \le x < y < T,\ y - x = 2^i k \text{ for some odd number } k\}. \qquad (6)$$
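
The orbit structure claimed in (6) can be confirmed by direct enumeration for small t. The sketch below is our own check, with hypothetical function names: it assumes $T = 2^{2^t}$, lists all maps $i \mapsto (ai + b) \bmod T$ with a odd, computes the orbits of unordered vertex pairs, and compares them with the classes determined by the largest power of 2 dividing y − x.

```python
from itertools import combinations


def edge_orbits(t):
    """Orbits of unordered pairs of {0..T-1} under i -> (a*i + b) mod T, a odd."""
    T = 2 ** (2 ** t)
    group = [(a, b) for a in range(1, T, 2) for b in range(T)]
    remaining = set(combinations(range(T), 2))
    orbits = []
    while remaining:
        x, y = next(iter(remaining))
        orbit = {
            tuple(sorted(((a * x + b) % T, (a * y + b) % T))) for a, b in group
        }
        orbits.append(frozenset(orbit))
        remaining -= orbit
    return T, len(group), orbits


def valuation_class(T, i):
    """Pairs x < y whose difference is 2^i times an odd number."""
    return frozenset(
        (x, y)
        for x, y in combinations(range(T), 2)
        if (y - x) % 2 ** (i + 1) == 2 ** i
    )


if __name__ == "__main__":
    for t in (1, 2):
        T, order, orbits = edge_orbits(t)
        expected = {valuation_class(T, i) for i in range(2 ** t)}
        print(T, order, len(orbits), set(orbits) == expected)
```

For t = 1 and t = 2 (that is, T = 4 and T = 16) this reports a group of order $T^2/2$ and exactly $2^t$ orbits, matching (6).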

^2 Note that we are counting not graphs, but labeled graphs.

Let G be an invariant graph. From now on, let us refer to the vertices
0, 1, ..., T − 1 as left vertices and the rest as right vertices. Let $G_{\text{left}}$ and $G_{\text{right}}$
denote the subgraphs of G induced by the left and right vertices, respectively. By
invariance of G, the set of right vertices adjacent to any left vertex is the same
for each left vertex; let $\mathcal{R}(G)$ denote this set. Also, the set of edges $E(G_{\text{left}})$ is the
union of a certain number of the orbits $E_i$; let orb(G) denote this number. We
shall show that whether or not G has the property $P_{\mathcal{F}}^n$ is completely determined
once $G_{\text{right}}$, $\mathcal{R}(G)$ and this number orb(G) are fixed; the specific $G_{\text{left}}$ does not
matter.

Lemma 3.3. For any invariant G, we have $\operatorname{chr}(G_{\text{left}}) = \operatorname{clq}(G_{\text{left}}) = 2^{\operatorname{orb}(G)}$.

Proof. Let $I \subseteq \{0, 1, \dots, 2^t - 1\}$ be such that $E(G_{\text{left}}) = \bigcup_{i \in I} E_i$; then we have
$|I| = \operatorname{orb}(G)$. Consider two vertices x, y of $G_{\text{left}}$. If their binary representations
agree on the bit positions indexed by I, then $x - y = \sum_{i \in I'} \pm 2^i$ for some set
$I'$ disjoint from I. By (6), this implies $(x, y) \notin E(G_{\text{left}})$. Therefore, the vertices
of $G_{\text{left}}$ can be partitioned into $2^{|I|}$ independent sets; thus $\operatorname{chr}(G_{\text{left}}) \le 2^{|I|}$.

On the other hand, if x, y are such that the bits in positions outside I are all
zero, then $x - y = \sum_{i \in I''} \pm 2^i$ for some $I'' \subseteq I$, which by (6) implies that
$(x, y) \in E(G_{\text{left}})$. Therefore, $G_{\text{left}}$ has a clique of size $2^{|I|} = 2^{\operatorname{orb}(G)}$. The lemma
follows. □
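
The two halves of this proof translate directly into a finite check. The sketch below is our own (the function name is hypothetical, and $T = 2^{2^t}$ is assumed): for every choice of the index set I it verifies that coloring each vertex by its bits in the positions of I is a proper coloring with $2^{|I|}$ colors, and that the vertices whose bits outside I are all zero form a clique of the same size. Together these sandwich the chromatic number and the clique number at $2^{|I|}$.

```python
from itertools import combinations


def check_lemma_3_3(t):
    """Verify the coloring and the clique from the proof for every index set I."""
    bits = 2 ** t
    T = 2 ** bits

    def lowest_bit(d):
        # Index of the lowest set bit of a positive integer d.
        return (d & -d).bit_length() - 1

    for size in range(bits + 1):
        for I in combinations(range(bits), size):
            mask = sum(1 << i for i in I)

            def is_edge(x, y):
                # (x, y) lies in E_i for i = 2-adic valuation of |x - y|.
                return lowest_bit(abs(x - y)) in I

            # Coloring by the bits in positions I must separate adjacent vertices.
            proper = all(
                (x & mask) != (y & mask)
                for x, y in combinations(range(T), 2)
                if is_edge(x, y)
            )
            colors = {x & mask for x in range(T)}
            # Vertices with zero bits outside I must be pairwise adjacent.
            clique = [x for x in range(T) if x & ~mask == 0]
            is_clique = all(is_edge(x, y) for x, y in combinations(clique, 2))
            if not (proper and is_clique and len(colors) == len(clique) == 2 ** size):
                return False
    return True


if __name__ == "__main__":
    print(check_lemma_3_3(1), check_lemma_3_3(2))
```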



Lemma 3.4. Let $G_1, G_2$ be two invariant n-vertex labeled graphs with $G_{1,\text{right}} =
G_{2,\text{right}}$, $\mathcal{R}(G_1) = \mathcal{R}(G_2)$ and $\operatorname{orb}(G_1) = \operatorname{orb}(G_2)$. Then $G_1$ has property $P_{\mathcal{F}}^n$ if
and only if $G_2$ does.

Proof. Suppose $G_1$ has property $P_{\mathcal{F}}^n$; we shall show that $G_2$ does too. Suppose
$G_1$ contains $F \in \mathcal{F}$ as a subgraph. We fix a particular occurrence of F within
$G_1$ so that we can talk about $F_{\text{left}}$, $F_{\text{right}}$ and $\mathcal{R}(F) := \mathcal{R}(G_1) \cap V(F)$.

Using Lemma 3.3 and the hypothesis, we obtain $\operatorname{chr}(F_{\text{left}}) \le \operatorname{chr}(G_{1,\text{left}}) =
\operatorname{clq}(G_{2,\text{left}})$. Let $h = \operatorname{chr}(F_{\text{left}})$; from the above inequality it is clear that $G_{2,\text{left}}$
contains $K_h$ as a subgraph. Fix a particular occurrence of $K_h$ and, starting with
the graph $F_{\text{right}}$, connect each of the h left vertices in this occurrence to each
vertex in $\mathcal{R}(F)$. Let F' be the resulting graph. Since $\mathcal{R}(F) \subseteq \mathcal{R}(G_1) = \mathcal{R}(G_2)$
and since $F_{\text{right}}$ is a subgraph of $G_{1,\text{right}} = G_{2,\text{right}}$, it follows that F' is a
subgraph of $G_2$.

Consider the following coloring of the graph F: we use h colors for its left
vertices and color each right vertex with a distinct color, never using any of these
h colors. Let $F'' \lhd F$ be the compression of F induced by this coloring. It is not
hard to see that F'' is a subgraph of F' and therefore of $G_2$. Since $\mathcal{F}$ is closed
under compression, $F'' \in \mathcal{F}$. Therefore $G_2$ has property $P_{\mathcal{F}}^n$. □

Lemma 3.5. For $n \ge T = 2^{2^t}$, we have $\chi_n \equiv \chi_{n-T+1} \pmod 2$.

Proof. Let k be a fixed integer with $0 \le k \le 2^t$. Recall that the group action
induced on the edges creates $2^t$ orbits. Consider the family of all n-vertex
invariant graphs G with $\mathcal{R}(G)$ and $G_{\text{right}}$ fixed, and $\operatorname{orb}(G) = k$. By Lemma 3.4,
either all graphs in this family have property $P_{\mathcal{F}}^n$ or none of them does. The size
of this family is $\binom{2^t}{k}$, which is even if $k \ne 0$ and $k \ne 2^t$. If $k = 2^t$, $G_{\text{left}}$ is a
complete graph, and so G contains a clique of size T. From (3), we see that G
has property $P_{\mathcal{F}}^n$. Therefore, by (5),

$$\chi_n \equiv \#\{G : \operatorname{orb}(G) = 0 \text{ and } G \text{ is nontrivial, invariant and doesn't satisfy } P_{\mathcal{F}}^n\} \pmod 2. \qquad (7)$$
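
The parity fact used in this step, namely that $\binom{2^t}{k}$ is even for every k strictly between 0 and $2^t$, can be spot-checked numerically. The short sketch below is our own and relies only on `math.comb` from the Python standard library.

```python
from math import comb


def middle_binomials_even(t):
    """True if C(2^t, k) is even for every k with 0 < k < 2^t."""
    m = 2 ** t
    return all(comb(m, k) % 2 == 0 for k in range(1, m))


if __name__ == "__main__":
    print([middle_binomials_even(t) for t in range(1, 8)])
```

This is why only the families with orb(G) = 0 can contribute an odd number of graphs to the count in (5).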

Suppose we take such a G with orb(G) = 0 and collapse all its left vertices into
one vertex which we connect to every vertex in $\mathcal{R}(G)$ and to no others, thereby
yielding a graph $\hat{G}$. This gives a bijection from n-vertex invariant graphs G with
orb(G) = 0 to (n − T + 1)-vertex graphs.

It is clear that if $\hat{G}$ has property $P_{\mathcal{F}}^{n-T+1}$ then G has property $P_{\mathcal{F}}^n$. Now
suppose G has property $P_{\mathcal{F}}^n$ and let $F \in \mathcal{F}$ be a subgraph of G. Since orb(G) = 0,
the vertices in $F_{\text{left}}$ form an independent set; thus we may color them all with one
color and then color each remaining vertex of F with a distinct color different
from the one just used. This coloring produces a compression $\hat{F} \lhd F$ which clearly
is a subgraph of $\hat{G}$. Since $\mathcal{F}$ is closed under compression, we have $\hat{F} \in \mathcal{F}$ and so
$\hat{G}$ has property $P_{\mathcal{F}}^{n-T+1}$. Thus our bijection respects the relevant property and
this completes the proof. □

We now have all the pieces needed for the 

Proof (of Lemma 3.2). Set $n = T = 2^{2^t}$. The only way for an n-vertex
graph to have orb(G) = 0 is for it to have no edges. Using (7), this implies
$\chi_T \equiv 0 \pmod 2$. Invoking Lemma 3.5 completes the proof. □



4 Proof of the Main Theorem 

We now return to proving Theorem 1.1. According to the theorem's hypotheses

$$n = \sum_{i=1}^{r} q^{\alpha_i} \qquad (8)$$

where q is a prime power, $q \ge |H|$, each $\alpha_i \ge 1$ and $r \equiv 1 \pmod{r_0}$. Our goal is
to show that $Q_H^n$ is evasive under these hypotheses for some choice of $r_0$.

The chief difficulty in applying the topological approach outlined in Section 2
lies in having to construct a group action natural enough for the property under
consideration and satisfying the stringent conditions on the underlying group
necessary for Theorem 2.3 to apply. In this section we shall come up with a
group action that allows us to “merge together” big clusters of vertices in our
graph, in the process changing the property under consideration from $Q_H^n$ to $P_{\mathcal{F}}^r$
for some family $\mathcal{F}$ of graphs, r being as in (8).

We partition the vertex set of our n-vertex graph into clusters $V_1, \dots, V_r$ with
$|V_i| = q^{\alpha_i}$ and identify vertices in $V_i$ with elements of the finite field $\mathbb{F}_{q^{\alpha_i}}$. Define
a permutation group $\Gamma$ on the vertices as follows:

$$\Gamma = \{(a, b_1, b_2, \dots, b_r) : a \in \mathbb{F}_q^*,\ b_i \in \mathbb{F}_{q^{\alpha_i}}\}, \qquad (9)$$

where $(a, b_1, b_2, \dots, b_r)$ denotes a permutation which sends $x \in V_i = \mathbb{F}_{q^{\alpha_i}}$ to
$ax + b_i \in V_i$. Let $\Gamma_1 = \{(1, b_1, \dots, b_r) : b_i \in \mathbb{F}_{q^{\alpha_i}}\}$. It is easy to check that $\Gamma_1$
is a normal subgroup of $\Gamma$, $|\Gamma_1| = q^{\sum_i \alpha_i}$, a prime power, and $\Gamma/\Gamma_1 \cong \mathbb{F}_q^*$, a
cyclic group. Thus $\Gamma$ satisfies the hypotheses of Theorem 2.3.

As in Section 3, the action of $\Gamma$ induces a group action on the edges and
thus partitions the edges into orbits. Let $\mathcal{A}$ denote the set of these orbits and let
$\Delta = \Delta(Q_H^n)$ denote the abstract complex associated with property $Q_H^n$. Define
a complex $\Delta_\Gamma$ on $\mathcal{A}$ as in (2):

$$\Delta_\Gamma = \Big\{\mathcal{D} \subseteq \mathcal{A} : \bigcup_{A \in \mathcal{D}} A \in \Delta\Big\}. \qquad (10)$$

Our intention is to show that the Euler characteristic $\chi(\Delta_\Gamma) \ne 1$. By
Theorem 2.3, evasiveness of $Q_H^n$ will follow. To this end, let us investigate what the
faces of $\Delta_\Gamma$ look like. Call an edge an intracluster edge if both its end points lie
in the same $V_i$ for some i; call the edge an intercluster edge otherwise.

Lemma 4.1. An orbit containing an intracluster edge is not contained in any
face of $\Delta_\Gamma$.

Proof. Let $A \in \mathcal{A}$ be the orbit of the intracluster edge (u, v), $u, v \in V_i$. Then
$A = \{(au + b, av + b) : b \in \mathbb{F}_{q^{\alpha_i}},\ a \in \mathbb{F}_q^*\}$. Set $w = v - u$. Then $(0, w) \in A$.
Consider the set of vertices $X = \{wz : z \in \mathbb{F}_q\}$. For $0 \ne x \in X$ we clearly
have $(0, x) \in A$. Thus for any pair of distinct vertices $x_1, x_2 \in X$, we have
$(0, x_2 - x_1) \in A$, whence $(x_1, x_2) \in A$. So A contains all edges among vertices in
X. Since $|X| = q \ge |H|$, the orbit A contains H as a subgraph. By definition, $\Delta$
cannot contain a face that includes A and so no face of $\Delta_\Gamma$ can contain A. □

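If $u \in V_i$, $v \in V_j$, $i < j$, then the orbit of the intercluster edge (u, v) is the
set $E_{ij}$ of all edges between $V_i$ and $V_j$. Let $\mathcal{E} = \{E_{ij} : i < j\} \subseteq \mathcal{A}$. From the
preceding lemma and (10) it is clear that

$$\Delta_\Gamma = \Big\{\mathcal{D} \subseteq \mathcal{E} : \bigcup_{A \in \mathcal{D}} A \in \Delta\Big\}. \qquad (11)$$

Let $\mathcal{D}$ be any subset of $\mathcal{E}$. Then $G_{\mathcal{D}} = \bigcup_{A \in \mathcal{D}} A$ is a graph on n vertices with
no intracluster edges and such that if $i \ne j$, the edges between $V_i$ and $V_j$ are
either all present or all absent. Define a graph $\hat{G}_{\mathcal{D}}$ on r vertices $v_1, \dots, v_r$ such
that $(v_i, v_j)$ is an edge iff all edges between $V_i$, $V_j$ are present in $G_{\mathcal{D}}$.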
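
The two orbit shapes described above can be observed on a small instance. The sketch below is our own illustration under simplifying assumptions: q = 5 is prime and every $\alpha_i = 1$, so each cluster is the field of integers mod 5 (general prime powers would need genuine finite-field arithmetic, which is not attempted here). Vertices are encoded as pairs (cluster index, field element), and the function name `orbit` is hypothetical.

```python
from itertools import combinations, product


def orbit(edge, q, r):
    """Orbit of an edge under x -> a*x + b_i on cluster i (a nonzero, q prime)."""
    (i, x), (j, y) = edge
    images = set()
    for a in range(1, q):
        for b in product(range(q), repeat=r):
            u = (i, (a * x + b[i]) % q)
            v = (j, (a * y + b[j]) % q)
            images.add(frozenset((u, v)))
    return images


q, r = 5, 3
intra = orbit(((0, 0), (0, 1)), q, r)  # both end points in cluster 0
inter = orbit(((0, 0), (1, 0)), q, r)  # end points in clusters 0 and 1

clique_on_cluster_0 = {
    frozenset(((0, x), (0, y))) for x, y in combinations(range(q), 2)
}
all_between_0_and_1 = {
    frozenset(((0, x), (1, y))) for x in range(q) for y in range(q)
}

if __name__ == "__main__":
    print(intra == clique_on_cluster_0, inter == all_between_0_and_1)
```

In this instance the intracluster orbit is a complete graph on q vertices, which contains any H with at most q vertices, and the intercluster orbit consists of all edges between the two clusters.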

Let $\mathcal{F}_H$ denote the family of all graphs H' such that $H' \lhd H$. It is easy to
check that $\mathcal{F}_H$ is closed under compression (refer to Definition 3.1). The following
lemma is simple to prove and connects this section with Section 3.

Lemma 4.2. H is a subgraph of $G_{\mathcal{D}}$ if and only if there is an $H' \in \mathcal{F}_H$ such that
H' is a subgraph of $\hat{G}_{\mathcal{D}}$. In other words, $G_{\mathcal{D}}$ satisfies $Q_H^n$ iff $\hat{G}_{\mathcal{D}}$ satisfies $P_{\mathcal{F}_H}^r$.

Proof. Suppose H is a subgraph of $G_{\mathcal{D}}$. Consider the following coloring of $G_{\mathcal{D}}$:
all vertices in a cluster are colored the same and no two clusters use the same
color. This is a valid coloring since each cluster of vertices is an independent
set. This coloring induces a coloring of H which in turn induces a compression
$H' \lhd H$. Clearly this H' is a subgraph of $\hat{G}_{\mathcal{D}}$.

Now suppose $H' \lhd H$ is a subgraph of $\hat{G}_{\mathcal{D}}$. Consider the graph $H_1$ with vertices
in $\bigcup_{i=1}^{r} V_i$ formed by taking all edges in $E_{ij}$ whenever $v_i$ and $v_j$ are adjacent in
H'. Since each $|V_i| \ge q \ge |H|$, it follows that H is a subgraph of $H_1$, and therefore
of $G_{\mathcal{D}}$. □

We are ready to prove our Main Theorem. 

Proof (of Theorem 1.1). Suppose $Q_H^n$ is not evasive. From Theorem 2.3,
we have $\chi(\Delta_\Gamma) = 1$. If r = 1, there is only one cluster, so by Lemma 4.1
we have $\Delta_\Gamma = \{\emptyset\}$, whence $\chi(\Delta_\Gamma) = 0$, a contradiction. Therefore r > 1.
Equation (11) and Lemma 4.2 imply that there is a one-to-one correspondence
between faces of $\Delta_\Gamma$ and nontrivial r-vertex graphs not satisfying property $P_{\mathcal{F}_H}^r$.

Hence the abstract complex $\Delta_\Gamma$ is the same as the abstract complex $\Delta_r^{\mathcal{F}_H}$ defined
in Section 3. It follows from the definition of compression that $\mathcal{F}_H$ contains the
complete graph on chr(H) vertices and contains no smaller graph. Therefore, (3)
yields $t = \lceil \lg \lg \operatorname{chr}(H) \rceil$. Setting $r_0 = 2^{2^t} - 1$ and applying Lemma 3.2 we have
$\chi(\Delta_r^{\mathcal{F}_H}) \ne 1$ and so $\chi(\Delta_\Gamma) \ne 1$, a contradiction. □

5 Consequences and Extensions 

Our techniques enable us to prove certain results with “cleaner” statements than 
our Main Theorem 1.1; we prove four such results below. The first two are simple 
corollaries of Theorem 1.1 while the other two can be easily proved using the 
machinery of its proof. Finally, we present an interesting generalization of our 
Main Theorem. 

Theorem 5.1. For any graph $H$ there exist infinitely many primes $p$ with the 
following property: for all sufficiently large $n$ divisible by $p$, the property $Q_n^H$ is 
evasive. 

Remark: Note that this establishes the evasiveness of $Q_n^H$ for an arithmetic 
progression of values of $n$. 

Proof. Choose an integer $t$ such that $T = 2^{2^t}$ is at least $|H|$. By Dirichlet's 
Theorem there exist infinitely many primes $p$ such that $p \equiv 2 \pmod{T-1}$. 
Fix one such $p > T$ and pick any $n \ge p^2(T-1)$ divisible by $p$. Now $p - 1$ is 
relatively prime to $T - 1$, therefore there is an integer $x$ such that 
$x(p-1) \equiv n/p - 1 \pmod{T-1}$ and $0 \le x < T - 1$. From the lower bound on $n$ we have 
$n/p - px > 0$. Therefore we can write 

$$n = \sum_{i=1}^{x} p^2 + \sum_{i=1}^{n/p - px} p$$

which is an expression of $n$ as a sum of powers of $p$. The number of summands 
in this expression is $x + n/p - px \equiv 1 \pmod{T-1}$. Since $p > T \ge |H|$, we can 
apply Theorem 1.1 to conclude that $Q_n^H$ is evasive. □ 
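
The decomposition used in this proof can be checked numerically. The following is a minimal sketch, not part of the paper: the function name `power_decomposition` and its return convention are assumptions made for illustration. It finds the residue $x$ by brute force and returns how many copies of $p^2$ and of $p$ appear in the sum.

```python
from math import gcd

def power_decomposition(n: int, p: int, T: int):
    """Return (x, ones) such that n = x*p**2 + ones*p and
    x + ones is congruent to 1 modulo T - 1.

    Hypothetical helper: mirrors the construction in the proof of
    Theorem 5.1 under its hypotheses (p divides n, gcd(p-1, T-1) = 1,
    n >= p^2 (T-1)).
    """
    m = T - 1
    assert n % p == 0, "n must be divisible by p"
    assert gcd(p - 1, m) == 1, "p - 1 must be coprime to T - 1"
    assert n >= p * p * m, "n is below the bound used in the proof"
    target = (n // p - 1) % m
    # x is the unique residue in [0, m) with x*(p-1) = n/p - 1 (mod m)
    x = next(x for x in range(m) if (x * (p - 1)) % m == target)
    ones = n // p - p * x  # number of summands equal to p
    return x, ones

# Example with t = 1, so T = 2**(2**1) = 4, and p = 5 (5 = 2 mod 3).
x, ones = power_decomposition(85, 5, 4)
print(x, ones, x * 25 + ones * 5, (x + ones) % 3)
```

The brute-force search over residues is adequate here because $T - 1$ is a constant depending only on $H$.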







Corollary 5.2. For any graph $H$ there exists a constant $c = c(H)$ such that for 
all sufficiently large $n$, the decision tree complexity of $Q_n^H$ is at least $\binom{n}{2} - cn$. 

□ 



Theorem 5.3. If the graph $H$ is bipartite, then $Q_n^H$ is evasive for all sufficiently 
large $n$. 

Proof. Since $\mathrm{chr}(H) = 2$, in the proof of Theorem 1.1, using the notation of that 
proof, we may take $t = 0$ which gives $r_0 = 1$. The condition $r \equiv 1 \pmod{r_0}$ is 
now trivially satisfied. The condition on $n$ becomes a simple requirement that 
$n$ be divisible by a prime power $q > |H|$. But if $n$ is sufficiently large then it 
clearly satisfies this condition. □ 



Theorem 5.4. Let $\mathcal{M}$ be an infinite minor-closed family of graphs that does not 
include all graphs. For $n$-vertex graphs, let $R_n^{\mathcal{M}}$ be the property of being in $\mathcal{M}$. 
Then $R_n^{\mathcal{M}}$ is evasive for all sufficiently large $n$. 

Remark: Planarity was already known to be evasive. This result is a major 
generalization. 

Proof. Let $H$ be a graph not in $\mathcal{M}$ with minimum size, and let $h = |H|$. Then 
$H$ is a minor of both the complete graph $K_h$ and the complete bipartite graph 
$K_{h,h}$; therefore no graph in $\mathcal{M}$ can contain either $K_h$ or $K_{h,h}$ as a subgraph. 
Suppose $n$ is divisible by a prime power $q > h$, a condition that always holds 
if $n$ is sufficiently large. Following the argument of Section 4 we divide the labeled 
vertices of the candidate graph $G$ into clusters of size $q$ and consider the orbits 
of the edges created by the action of the group $\Gamma$ described there. Let $\Delta$ be the 
abstract complex associated with the negation* of $R_n^{\mathcal{M}}$. An orbit containing an 
intracluster edge cannot be included in a face of $\Delta_\Gamma$ because its edges, if present, 
would create a $K_q$ subgraph. An orbit containing an intercluster edge cannot be 
included either because its edges, if present, would create a $K_{q,q}$ subgraph. Thus, 
$\Delta_\Gamma = \{\emptyset\}$ and so $\chi(\Delta_\Gamma) = 0 \ne 1$. By Theorem 2.3, the negation of $R_n^{\mathcal{M}}$ is evasive 
and therefore so is $R_n^{\mathcal{M}}$. □ 

The next theorem generalizes our Main Theorem and can be proved essen- 
tially using the same argument as that for the Main Theorem. 

Theorem 5.5. Let $f : \{0,1\}^k \to \{0,1\}$ be a nontrivial monotone boolean function 
and let $H_1, \ldots, H_k$ be arbitrary graphs. Define the composite property 
$Q_n = f(Q_n^{H_1}, \ldots, Q_n^{H_k})$. Then there exists an integer $r_0$ with the following property. 
Suppose $n = \sum_{i=1}^{r} q^{a_i}$ where $q$ is a prime power, $q > \max_{1 \le i \le k} |H_i|$, each 
$a_i \ge 1$ and $r \equiv 1 \pmod{r_0}$. Then $Q_n$ is evasive. □ 

Remark: This theorem shows, for instance, that properties like "G either contains 
$H_1$ as a subgraph or else contains both $H_2$ and $H_3$ as subgraphs" are evasive 
for several values of $n$. This theorem has corollaries similar to Theorem 5.1 and 
Corollary 5.2. 

* Notice that the property $R_n^{\mathcal{M}}$ is not monotone. However, its negation is monotone. 
Clearly a property is evasive if its negation is. 
6 Concluding Remarks 

The major open question in the area of decision tree complexity of graph prop- 
erties is to settle the Karp Conjecture. The pioneering work of Kahn et al. [4] 
has given us a possible direction to follow in attempting to settle this conjec- 
ture. Since the publication of that work, our work is the first which extends their 
topological approach for a fairly general class of graph properties. 

An obvious open question raised by our work is: how far can one enlarge the 
set of values of n for which our results hold? We conjecture that in the notation 
of Section 3, we have Xn 1 for large enough n. If proved true, this conjecture 
would remove all number theoretic restrictions in the Main Theorem. 
Abstract. We consider the problem of computing an optimal range assignment 
in a wireless network which allows a specified source station to perform a broad- 
cast operation. In particular, we consider this problem as a special case of the 
following more general combinatorial optimization problem, called Minimum 
Energy Consumption Broadcast Subgraph (in short, MECBS): Given a weighted 
directed graph and a specified source node, find a minimum cost range assign- 
ment to the nodes, whose corresponding transmission graph contains a spanning 
tree rooted at the source node. We first prove that MECBS is not approximable 
within a sub-logarithmic factor (unless P=NP). We then consider the restriction of 
MECBS to wireless networks and we prove several positive and negative results, 
depending on the geometric space dimension and on the distance-power gradient. 
The main result is a polynomial-time approximation algorithm for the NP-hard 
case in which both the dimension and the gradient are equal to 2: This algorithm 
can be generalized to the case in which the gradient is greater than or equal to the 
dimension. 



1 Introduction 

Wireless networking technology will play a key role in future communications and the 
choice of the network architecture model will strongly impact the effectiveness of the 
applications proposed for the mobile networks of the future. Broadly speaking, there 
are two major models for wireless networking: single-hop and multi-hop. The single-hop 
model [22], based on the cellular network model, provides one-hop wireless connectivity 
between mobile hosts and static nodes known as base stations. This type of 
network relies on a fixed backbone infrastructure that interconnects all base stations 
by high-speed wired links. On the other hand, the multi-hop model [15] requires nei- 
ther fixed, wired infrastructure nor predetermined interconnectivity. Ad hoc networking 
[12] is the most popular type of multi-hop wireless networks because of its simplicity: 
Fig. 1. A Range Assignment and Its Corresponding Directed Transmission Graph. 



Indeed, an ad hoc wireless network is constituted by a homogeneous system of mobile 
stations connected by wireless links. In ad hoc networks, to every station is assigned a 
transmission range: The overall range assignment determines a transmission (directed) 
graph since one station s can transmit to another station t if and only if t is within the 
transmission range of s (see Fig. 1). 

The transmission range of a station depends, in turn, on the energy power supplied 
to the station: In particular, the power $P_s$ required by a station s to correctly transmit 
data to another station t must satisfy the inequality 

$$\frac{P_s}{d(s,t)^{\alpha}} > \gamma \qquad (1)$$

where $d(s,t)$ is the Euclidean distance between s and t, $\alpha \ge 1$ is the distance-power 
gradient, and $\gamma \ge 1$ is the transmission-quality parameter. In an ideal environment (i.e., 
in empty space) it holds that $\alpha = 2$, but $\alpha$ may vary from 1 to more than 6 depending 
on the environmental conditions of the place where the network is located (see [19]). The 
fundamental problem underlying any phase of a dynamic resource allocation algorithm in 
ad hoc wireless networks is the following: Find a transmission range assignment such 
that (1) the corresponding transmission graph satisfies a given property $\pi$, and (2) the 
overall energy power required to deploy the assignment (according to Eq. 1) is minimized. 

A well-studied case of the above problem consists in choosing $\pi$ as follows: The 
transmission graph has to be strongly connected. In this case, it is known that: (a) the 
problem is not solvable in polynomial time (unless P=NP) [6,14], (b) it is possible 
to compute a range assignment whose cost is at most twice the optimal one (that is, the 
problem is 2-approximable), for multi-dimensional wireless networks [14], (c) there 
exists a constant $r > 1$ such that the problem is not r-approximable (unless P=NP), 
for d-dimensional networks with $d \ge 3$ [6], and (d) the problem can be solved in 
polynomial time for one-dimensional networks [14]. Another analyzed case consists 
in choosing $\pi$ as follows: The diameter of the transmission graph has to be at most 
a fixed value h. In this case, while non-trivial negative results are not known, some 
tight bounds (depending on h) on the minimum energy power have been proved in [7], 
and an approximation algorithm for the one-dimensional case has been given in [5]. 
Other trade-offs between connectivity and energy consumption have been obtained in 
[16,21,24]. 
In this paper we address the case in which $\pi$ is defined as follows: Given a source 
station s, the transmission graph has to contain a directed spanning tree rooted at s. 
This case has been posed as an open question by Ephremides in [10]: Its relevance is 
due to the fact that any transmission graph satisfying the above property allows the 
source station to perform a broadcast operation. Broadcast is a task initiated by the 
source station which transmits a message to all stations in the wireless network: This 
task constitutes a major part of real-life multi-hop radio networks [2,3]. 

The Optimization Problem. The broadcast range assignment problem described above 
is a special case of the following combinatorial optimization problem, called Minimum 
Energy Consumption Broadcast Subgraph (in short, MECBS). Given 
a weighted directed graph $G = (V, E)$ with edge weight function $w : E \to \mathbb{R}^+$, a 
range assignment for G is a function $r : V \to \mathbb{R}^+$. The transmission graph induced by 
G and r is defined as $G_r = (V, E')$ where 

$$E' = \bigcup_{v \in V} \{(v,u) : (v,u) \in E \wedge w(v,u) \le r(v)\}.$$

The MECBS problem is then defined as follows: Given a source node $s \in V$, find a 
range assignment r for G such that $G_r$ contains a spanning tree of G rooted at s and 
$\mathrm{cost}(r) = \sum_{v \in V} r(v)$ is minimized. 
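
The definitions above translate directly into a few lines of code. The sketch below is only illustrative: the function names (`transmission_graph`, `cost`, `is_feasible`) and the dictionary-based graph encoding are assumptions, not notation from the paper. It builds the edge set of the transmission graph for a given range assignment, sums the ranges, and checks feasibility by testing whether every node is reachable from the source.

```python
from collections import deque

def transmission_graph(edges, r):
    """edges: dict mapping (v, u) to the weight w(v, u).
    r: dict mapping a node v to its range r(v); missing nodes get range 0.
    Returns the edge set E' of the transmission graph."""
    return {(v, u) for (v, u), w in edges.items() if w <= r.get(v, 0.0)}

def cost(r):
    """Cost of a range assignment: the sum of all assigned ranges."""
    return sum(r.values())

def is_feasible(nodes, edges, r, s):
    """True if the transmission graph contains a spanning tree rooted at s,
    i.e. every node is reachable from s along transmission edges."""
    adj = {}
    for v, u in transmission_graph(edges, r):
        adj.setdefault(v, []).append(u)
    seen, queue = {s}, deque([s])
    while queue:  # breadth-first search from the source
        v = queue.popleft()
        for u in adj.get(v, []):
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return seen == set(nodes)

nodes = ["s", "a", "b"]
edges = {("s", "a"): 1, ("a", "b"): 2, ("s", "b"): 5}
r = {"s": 1, "a": 2, "b": 0}
print(is_feasible(nodes, edges, r, "s"), cost(r))
```

In this toy instance the two-hop assignment has cost 3, whereas reaching b directly from s would require range 5 at the source.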

Let us consider, for any $d \ge 1$ and for any $\alpha \ge 1$, the family of graphs $N_d^{\alpha}$, called 
(d-dimensional) wireless networks, defined as follows: A complete (undirected) graph 
G belongs to $N_d^{\alpha}$ if it can be embedded in a d-dimensional Euclidean space such that the 
weight of an edge is equal to the $\alpha$th power of the Euclidean distance between the two 
endpoints of the edge itself. The restriction of MECBS to graphs in $N_d^{\alpha}$ is denoted by 
MECBS[$N_d^{\alpha}$]: It is then clear that the previously described broadcast range assignment 
problem in the ideal 2-dimensional environment is MECBS[$N_2^2$]. 

Our Results. In this paper, we analyze the complexity of the Minimum Energy Con- 
sumption Broadcast Subgraph problem both in the general case and in the more 
realistic case in which the instances are wireless networks. In particular, we first prove 
that MECBS is not approximable within a sub-logarithmic factor, unless P=NP (see 
Sect. 2). Subsequently, we consider MECBS[$N_d^{\alpha}$], for any $d \ge 1$ and for any $\alpha \ge 1$, 
and we prove the following results (see Sect. 3): 

- For any $d \ge 1$, MECBS[$N_d^1$] is solvable in polynomial time. This result is based 
on a simple observation. 

- MECBS[$N_d^{\alpha}$] is not solvable in polynomial time (unless P=NP), for any $d \ge 2$ 
and for any $\alpha > 1$: This negative result uses the same arguments as [6]. 

- For any $\alpha \ge 2$, MECBS[$N_2^{\alpha}$] is approximable within a constant factor. This is 
the main result of the paper. A major positive aspect of the approximation algorithm 
lies in the fact that it is based solely on the computation of a standard minimum 
spanning tree (MST, for short). In a network with dynamic power control, the range 
assigned to the stations can be modified at any time: Our algorithm can thus take 
advantage of all known techniques to dynamically maintain MSTs (see, for example, 
[9,11,18]). MSTs have already been used in order to develop approximation algorithms 
for range assignment problems in wireless networks: However, we believe 
that the analysis of the performance of our algorithm (which is based on computational 
geometry techniques) is rather interesting in itself. 

Finally, in Sect. 4 we first observe that our approximation algorithm can be generalized 
in order to deal with MECBS[$N_d^{\alpha}$], for any $d \ge 2$ and for any $\alpha \ge d$: However, 
we also prove that the approximation ratio grows at least exponentially with respect to 
d. We then briefly consider the behavior of our approximation algorithm when applied 
to MECBS[$N_d^{\alpha}$] with $\alpha < d$ and we summarize some questions left open by this paper. 

Prerequisites. We assume the reader to be familiar with the basic concepts of compu- 
tational complexity theory (see, for example, [4,20]) and with the basic concepts of the 
theory of approximation algorithms (see, for example, [1]). 

2 The Complexity of MECBS 

In this section, we prove that the Minimum Energy Consumption Broadcast 
Subgraph problem is not approximable within a sub-logarithmic factor (unless P=NP). 
To this aim, we provide a reduction from Min Set Cover to MECBS. Recall 
that Min Set Cover is defined as follows: given a collection C of subsets of a finite 
set S, find a minimum cardinality subset $C' \subseteq C$ such that every element in S belongs 
to at least one member of $C'$. It is known that, unless P=NP, Min Set Cover is not 
approximable within $c \log n$, for some $c > 0$, where n denotes the cardinality of S [23] 
(see, also, the list of optimization problems contained in [1]). 

Theorem 1. If P ≠ NP, then MECBS is not approximable within a sub-logarithmic 
factor. 

Proof (Sketch). Let x be an instance of the Min Set Cover problem. In the full version 
of the paper, we show how to construct an instance y of MECBS such that there 
exists a feasible solution for x whose cardinality is equal to k if and only if there exists 
a feasible solution for y whose cost is equal to k + 1. This clearly implies that 
if MECBS is approximable within a sub-logarithmic factor, then Min Set Cover 
is approximable within a sub-logarithmic factor: The theorem hence follows from the 
non-approximability of Min Set Cover. □ 

One interesting feature of the reduction used in the previous proof is that it also allows 
us to show that MECBS is not approximable within a constant factor (unless P=NP), 
when the problem is restricted to undirected graphs. 

3 The Restriction to Wireless Networks 

In this section we analyze the complexity of the Minimum Energy Consumption 
Broadcast Subgraph problem restricted to wireless networks, that is, MECBS[$N_d^{\alpha}$] 
with $d, \alpha \ge 1$. First of all, observe that if $\alpha = 1$ (that is, the edge weights coincide with 
the Euclidean distances), then the optimal range assignment is simply obtained by assigning 
to s the distance from its farthest node and by assigning 0 to all other nodes. 
We then have that the following result holds. 

Theorem 2. For any $d \ge 1$, MECBS[$N_d^1$] is solvable in polynomial time. 

It is, instead, possible to prove the following result, whose proof is an adaptation of 
the one given in [6] to prove the NP-hardness of computing a minimum range assign- 
ment that guarantees the strong connectivity of the corresponding transmission graph 
(the proof will be given in the full version of the paper). 

Theorem 3. For any $d \ge 2$ and for any $\alpha > 1$, MECBS[$N_d^{\alpha}$] is not solvable in 
polynomial time (unless P=NP). 

Because of the above negative result, it is reasonable to look for polynomial-time algorithms 
that compute approximate solutions for MECBS restricted to wireless networks. 
We now present and analyze an efficient approximation algorithm for MECBS[$N_2^{\alpha}$], for 
any $\alpha \ge 2$. In what follows, given a graph $G \in N_2^{\alpha}$, we denote by $G^{1/\alpha}$ the graph 
obtained from G by setting the weight of each edge to the $\alpha$th root of the weight of the 
corresponding edge in G: Hence, $G^{1/\alpha} \in N_2^1$, that is, there exists an embedding of 
$G^{1/\alpha}$ on the plane such that the Euclidean distance $d(u,v)$ between two nodes u and v 
coincides with the weight of the edge $(u,v)$ in $G^{1/\alpha}$. 

The Approximation Algorithm Mst-Alg. Given a graph $G \in N_2^{\alpha}$ and a 
specified source node s, the algorithm first computes a MST T of G (observe 
that this computation does not depend on the value of $\alpha$). Subsequently, 
it makes T downward oriented by rooting it at s. Finally, the algorithm 
assigns to each vertex v the maximum among the weights of all edges of 
T outgoing from v. Clearly, the algorithm runs in polynomial time and 
computes a feasible solution. 
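
The three steps of Mst-Alg can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the input is taken to be a list of planar points (an embedding of $G^{1/\alpha}$), the MST is computed with a simple quadratic-time version of Prim's algorithm started at the source, and the function name `mst_alg` is hypothetical. Because Prim's algorithm grows the tree from s, the parent pointers already give the orientation away from s.

```python
import math

def mst_alg(points, s, alpha=2.0):
    """Return the list r of ranges computed by an MST-based assignment.

    points: list of (x, y) coordinates; s: index of the source node;
    alpha: distance-power gradient (edge weight = distance ** alpha).
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n    # distance from each node to the current tree
    parent = [None] * n      # parent in the tree rooted at s
    best[s] = 0.0
    r = [0.0] * n            # range assignment, initially all zero
    for _ in range(n):
        # pick the closest node not yet in the tree (Prim's step)
        v = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[v] = True
        if parent[v] is not None:
            # edge (parent[v], v) leaves parent[v]; keep the heaviest one
            w = math.dist(points[parent[v]], points[v]) ** alpha
            r[parent[v]] = max(r[parent[v]], w)
        for u in range(n):
            if not in_tree[u]:
                d = math.dist(points[v], points[u])
                if d < best[u]:
                    best[u], parent[u] = d, v
    return r

# Three collinear stations: the source relays through the middle one.
print(mst_alg([(0, 0), (1, 0), (3, 0)], s=0, alpha=2.0))
```

The MST is computed on Euclidean distances rather than on the $\alpha$th powers, which is harmless because raising to the power $\alpha$ preserves the ordering of the edge weights.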

3.1 The Performance Analysis of the Approximation Algorithm 

The goal of this section is to prove that, for any instance $x = (G = (V, E), w, s)$ of 
MECBS[$N_2^{\alpha}$] with $\alpha \ge 2$, the range assignment r computed by Mst-Alg satisfies the 
following inequality: 

$$\mathrm{cost}(r) \le 10^{\alpha/2} \cdot 2^{\alpha}\, \mathrm{opt}(x), \qquad (2)$$

where opt(x) denotes the cost of an optimal range assignment. First notice that 

$$\mathrm{cost}(r) \le w(T),$$

where, for any subgraph G' of G, $w(G')$ denotes the sum of the weights of the edges in 
G'. As a consequence of the above inequality, it now suffices to show that there exists 
a spanning subgraph G' of G such that $w(G') \le 10^{\alpha/2} \cdot 2^{\alpha}\, \mathrm{opt}(x)$. Indeed, since the 
weight of T is bounded by the weight of G', we have that Eq. 2 holds. 

In order to prove the existence of G', we make use of the following theorem whose 
proof is given in Sect. 3.2. 
Theorem 4. Let $G \in N_2^{\alpha}$ with $\alpha \ge 2$ and let R be the diameter of $G^{1/\alpha}$, that is, the 
maximum distance between two nodes in $G^{1/\alpha}$. Then, for any MST T of G, 

$$w(T) \le 10^{\alpha/2} \cdot R^{\alpha}.$$

Let $r_{opt}$ be an optimal assignment for x. For any $v \in V$, let 

$$S(v) = \{u \in V : w(v,u) \le r_{opt}(v)\}$$

and let $T(v)$ be a MST of the subgraph of G induced by $S(v)$. From Theorem 4, it follows 
that $w(T(v)) \le 10^{\alpha/2} \cdot 2^{\alpha}\, r_{opt}(v)$. Consider the spanning subgraph $G' = (V, E')$ 
of G such that 

$$E' = \bigcup_{v \in V} \{e \in E : e \in T(v)\}.$$

It then follows that 

$$w(G') \le \sum_{v \in V} w(T(v)) \le 10^{\alpha/2} \cdot 2^{\alpha} \sum_{v \in V} r_{opt}(v) = 10^{\alpha/2} \cdot 2^{\alpha}\, \mathrm{opt}(x).$$

We have thus proved the following result. 

Theorem 5. For any $\alpha \ge 2$, MECBS[$N_2^{\alpha}$] is approximable within $10^{\alpha/2} \cdot 2^{\alpha}$. 



3.2 Proof of Theorem 4 

Given a graph $G \in N_2^{\alpha}$ with $\alpha \ge 2$, we identify the nodes of G with the points 
corresponding to an embedding of $G^{1/\alpha}$ on the plane: Recall that the Euclidean distance 
$d(u,v)$ between two points u, v coincides with the weight of the edge $(u,v)$ in $G^{1/\alpha}$. 

Let us first consider the case $\alpha = 2$ and let $e_i = (u_i, v_i)$ be the ith edge in T, 
for $i = 1, \ldots, |V| - 1$ (any fixed ordering of the edges is fine). We denote by $D_i$ the 
diametral open circle of $e_i$, that is, the open disk whose center $c_i$ is on the midpoint of 
$e_i$ and whose diameter is $d(u_i, v_i)$. From Lemma 6.2 of [17], it follows that $D_i$ contains 
no point from the set $V - \{u_i, v_i\}$. The following lemma, instead, states that, for any 
two diametral circles, the center of one circle is not contained in the other circle. 

Lemma 1. For any $i, j \in \{1, \ldots, |V| - 1\}$ with $i \ne j$, $c_i$ is not contained in $D_j$. 

Proof. Suppose by contradiction that there exist two diametral circles $D_i$ and $D_j$ such 
that $c_i$ is contained in $D_j$. We will show that the longest edge between $e_i$ and $e_j$ can be 
replaced by a strictly shorter one, still maintaining the connectivity of T: Since T is a 
MST, the lemma will follow. Let us assume, without loss of generality, that 
$d(u_j, v_j) \ge d(u_i, v_i)$. We first prove that 

$$\max\{d(u_i,u_j),\, d(v_i,v_j)\} < d(u_j,v_j). \qquad (3)$$

Let $Y^+$ and $Y^-$ be the half-planes determined by the line identified by $c_i$ and $c_j$: 
Without loss of generality, we may assume that $v_i$ and $v_j$ (respectively, $u_i$ and $u_j$) 
are both contained in $Y^+$ (respectively, $Y^-$), as shown in Fig. 2. Assume also that 
Fig. 2. The Proof of Lemma 1. 



$d(v_i, v_j) \ge d(u_i, u_j)$ (the other case can be proved in a similar way). Let x be the 
intersection point in $Y^-$ between the two circumferences determined by $D_i$ and $D_j$ 
(notice that, since $D_i$ and $D_j$ are open disks, neither $D_i$ nor $D_j$ contains x) and let $x_i$ 
and $x_j$ be the points diametrically opposite to x with respect to $c_i$ and $c_j$, respectively. 
Clearly, $d(v_i, v_j) \le d(x_i, x_j)$. Eq. 3 easily follows from the following fact. 

Fact 1. $d(x_i, x_j) < d(u_j, v_j)$. 

Proof (of Fact 1). By definition, $c_i$ (respectively, $c_j$) is the midpoint of the 
segment $x x_i$ (respectively, $x x_j$). Thus, the triangles $\triangle(x x_i x_j)$ and 
$\triangle(x c_i c_j)$ are similar. From the hypothesis that $c_i \in D_j$, it follows that 
$d(c_i, c_j) < d(x, c_j)$. Thus, by similarity, it must hold that 

$$d(x_i, x_j) < d(x, x_j) = d(u_j, v_j)$$

and the fact follows. □ 

As a consequence of Eq. 3, we can replace in T the edge $e_j = (u_j, v_j)$ by either $(u_i, u_j)$ or 
$(v_i, v_j)$ (the choice depends on the topology of T), thus obtaining a better spanning 
tree. □ 

We now use the above lemma in order to bound the number of diametral circles any 
point on the plane belongs to. 

Lemma 2. For any point p on the plane, p is contained in at most five diametral circles. 

Proof. Suppose by contradiction that there exists a point p covered by (at least) six 
diametral circles. Then, there must exist two circles $D_1$ and $D_2$ such that their respective 
centers $c_1$ and $c_2$ form with p an angle $\beta \le \pi/3$ (see Fig. 3(a)). Let $R_1$ and $R_2$ denote 
the radii of $D_1$ and $D_2$, respectively. Since $\beta \le \pi/3$, we have that 

$$d(c_1, c_2) \le \max\{d(c_1, p),\, d(c_2, p)\} < \max\{R_1, R_2\}$$

where the strict inequality is due to the fact that $p \in D_1 \cap D_2$ and that both $D_1$ and $D_2$ 
are open disks. Hence, either $c_1 \in D_2$ or $c_2 \in D_1$, thus contradicting Lemma 1. □ 
Fig. 3. The Proof of Lemma 2 



For any i with $1 \le i \le |V| - 1$, let $\bar{D}_i$ denote the smallest closed disk that contains 
$D_i$. The last lemma of this section states that the union of all $\bar{D}_i$'s is contained in a 
closed disk whose diameter is comparable to the diameter of $G^{1/\alpha}$. 

Lemma 3. Let D = Ue eT Then, D is contained into the closed disk whose diam- 
eter is equal to s/2R and whose center coincides with the center of D. 

Proof. Consider any two points x and y within D. It is easy to see that the worst case 
corresponds to the case in which both x and y are on the boundary of D. Consider the 
closed disk whose diameter is equal to $d(x, y)$ and whose center c' is on the midpoint 
of the segment xy, and let z be any point on its boundary (see Fig. 4). It holds that 
$d(c, z) \le \sqrt{2}R/2$, where c is the center of D. Indeed, from the triangle inequality we 
have that 

$$d(c, z) \le d(c, c') + d(c', z) = d(c, c') + d(x, y)/2.$$

Moreover, since the angle $cc'y$ is equal to $\pi/2$, 

$$d(c, c')^2 + d(c', y)^2 = d(c, y)^2 = R^2/4.$$

Thus, 

$$d(c, z) \le \frac{\sqrt{R^2 - d(x, y)^2}}{2} + \frac{d(x, y)}{2}.$$

The right-hand side of this inequality reaches its maximum when $d(x, y) = \sqrt{2}R/2$, which 
implies $d(c, z) \le \sqrt{2}R/2$. Hence the lemma follows. □ 

We are now able to prove Theorem 4. In particular, we have to prove that 

$$\sum_{i=1}^{|V|-1} d(u_i, v_i)^2 \le 10R^2, \qquad (4)$$

where $(u_i, v_i)$ is the ith edge in T, for $i = 1, \ldots, |V| - 1$. Indeed, let Area($D_i$) denote 
the area of $D_i$. It then holds that 

$$\sum_{i=1}^{|V|-1} d(u_i, v_i)^2 = \frac{4}{\pi} \sum_{i=1}^{|V|-1} \mathrm{Area}(D_i). \qquad (5)$$
Fig. 4. The Proof of Lemma 3. 



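By combining Lemmas 3 and 2, we have that 

$$\sum_{i=1}^{|V|-1} \mathrm{Area}(D_i) \le 5\pi \left(\frac{\sqrt{2}R}{2}\right)^2 = \frac{5\pi}{2} R^2. \qquad (6)$$

By combining Eq. 5 and 6 we obtain Eq. 4, which proves the theorem for $\alpha = 2$. 
Finally, we consider the case $\alpha > 2$. By using simple computations, we get 

$$w(T) = \sum_{i=1}^{|V|-1} d(u_i, v_i)^{\alpha} = \sum_{i=1}^{|V|-1} \left(d(u_i, v_i)^2\right)^{\alpha/2} \le \left(\sum_{i=1}^{|V|-1} d(u_i, v_i)^2\right)^{\alpha/2} \le 10^{\alpha/2} R^{\alpha}$$

where the last inequality follows from Eq. 4. This completes the proof of Theorem 4. 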



4 Further Results and Open Questions 

Algorithm Mst-Alg can be generalized to higher dimensions. In particular, it is pos- 
sible to prove the following result. 

Theorem 6. There exists a function $f : \mathbb{N} \times \mathbb{R} \to \mathbb{R}$ such that, for any $d \ge 2$ and for 
any $\alpha \ge d$, MECBS[$N_d^{\alpha}$] is approximable within factor $f(d, \alpha)$. 

The proof of the above theorem is again based on the computation of a MST of the input 
graph: Indeed, the algorithm is exactly the same. Unfortunately, the following result 
(whose proof is based on results in [8,13,25] and will be given in the full version of the 
paper) shows that the function f in the statement of the theorem grows exponentially 
with respect to d. 

Theorem 7. There exists a positive constant 7 such that, for any d and for any k, an 
instance Xk,d 0 / MECBS[NJ^] exists such that opt(xfc_d) = while the cost of the 
range assignment computed by Mst-Alg is at least k‘^ • 2'*‘^. 
One could also ask whether our algorithm approximates MECBS[$N_d^{\alpha}$] in the case 
in which $d \ge 2$ and $\alpha < d$. Unfortunately, it is not difficult to produce an instance 
X such that opt(x) = while the cost of the range assignment computed by 

Mst-Alg is $\Omega(n)$, where n denotes the number of vertices: For example, in the case 
$d = 2$, we can just consider the two-dimensional grid of side $\sqrt{n}$ and the source node 
positioned at its center. 
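
The gap on the grid instance can be made concrete with a short computation. The sketch below rests on assumptions that are not spelled out in the text: the grid has unit spacing and odd side k (so that a central node exists), the function name `grid_gap` is hypothetical, and the "single-hop" value is only an upper bound on the optimum, obtained by letting the source reach every node directly. The lower bound on the MST-based cost uses the fact that every MST of the unit grid consists of n - 1 unit-weight edges and that each node has at most four tree neighbors, so at least (n - 1)/4 nodes must be assigned range 1.

```python
import math

def grid_gap(k: int, alpha: float):
    """For the k x k unit grid (n = k*k nodes, k odd) with the source at
    the center, return (single_hop_cost, mst_cost_lower_bound)."""
    n = k * k
    c = (k - 1) / 2                  # coordinate of the central node
    farthest = math.hypot(c, c)      # distance from the center to a corner
    single_hop = farthest ** alpha   # feasible: the source covers all nodes
    mst_lower = (n - 1) / 4          # at least this many nodes get range 1
    return single_hop, mst_lower

single_hop, mst_lower = grid_gap(101, alpha=1.0)
print(single_hop, mst_lower)
```

For $\alpha < 2$ the single-hop cost grows like $n^{\alpha/2}$, which is asymptotically smaller than the linear lower bound on the MST-based cost.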

Open Problems. Three main problems are left open by this paper. The first one is to 
improve the analysis of Mst-Alg (or to develop a different algorithm with a better 
performance ratio). Actually, we have performed several experiments and it turns out 
that the practical value of the performance ratio of Mst-Alg (in the case in which 
$d = 2$ and $\alpha = 2$) is between 2 and 3. The second open problem is to analyze the 
approximability properties of MECBS[$N_d^{\alpha}$] when $\alpha < d$: In particular, it would be very 
interesting to study the three-dimensional case. As previously observed, the MST-based 
algorithm does not guarantee any approximation, and it thus seems necessary to develop 
approximation algorithms based on different techniques. The last open problem is to 
consider MECBS[$N_1^{\alpha}$], for any $\alpha > 1$: In particular, we conjecture that this problem is 
solvable in polynomial time. 
Abstract. Using an automata-theoretic approach, we investigate the 
decidability of liveness properties (called Presburger liveness properties) 
for timed automata when Presburger formulas on configurations are 
allowed. While the general problem of checking a temporal logic such 
as TPTL augmented with Presburger clock constraints is undecidable, 
we show that there are various classes of Presburger liveness properties 
which are decidable for discrete timed automata. For instance, it is decidable, 
given a discrete timed automaton A and a Presburger property P, 
whether there exists an ω-path of A where P holds infinitely often. We 
also show that other classes of Presburger liveness properties are indeed 
undecidable for discrete timed automata, e.g., whether P holds infinitely 
often for each ω-path of A. These results might give insights into the 
corresponding problems for timed automata over dense domains, and help 
in the definition of a fragment of linear temporal logic, augmented with 
Presburger conditions on configurations, which is decidable for model 
checking timed automata. 



1 Introduction 

Timed automata [3] are widely regarded as a standard model for real-time sys- 
tems, because of their ability to express quantitative time requirements in the 
form of clock regions: a clock or the difference of two clocks is compared against 
an integer constant, e.g., x — y > 5, where x and y are clocks. A fundamental re- 
sult in the theory of timed automata is that region reachability is decidable. This 
has been proved by using the region technique [3]. This result is very useful since 
in principle it allows some forms of automatic verification of timed automata. In 
particular, it helps in developing a number of temporal logics [2,6,13,15,4,16], in 
investigating the model-checking problem and in building model-checking tools 
[12,17,14] (see [1,18] for surveys). 

In real-world applications [7], clock constraints represented as clock regions 
are useful but often not powerful enough. For instance, we might want to argue 
whether a non-region property such as $x_1 - x_2 > x_3 - x_4$ (i.e., the difference 
of clocks $x_1$ and $x_2$ is larger than that of $x_3$ and $x_4$) always holds when a 
timed automaton starts from clock values satisfying another non-region property. 
Hence, it would be useful to consider Presburger formulas as clock constraints, 
were it not for the fact that a temporal logic like TPTL [6] is undecidable 
when augmented with Presburger clock constraints [6]. However, recent work 
[9,10] has found decidable characterizations of the binary reachability of timed 
automata, giving hope that some important classes of non-region properties are 
still decidable for timed automata. 

In this paper, we look at discrete timed automata (dta), i.e., timed automata 
where clocks are integer-valued. Discrete time makes it possible to apply, as 
underlying theoretical tools, a good number of automata-theoretic techniques and 
results. Besides the facts that discrete clocks are usually easier to handle than 
dense clocks also for practitioners, and that dtas are useful by themselves as a 
model of real-time systems [5], results on dtas may give insights into correspond- 
ing properties of dense timed automata [11]. 

The study of safety properties and liveness properties is of course of the 
utmost importance for real-life applications. In [10] (as well as in [9]), it has been 
shown that the Presburger safety analysis problem is decidable for discrete timed 
automata. That is, it is decidable whether, given a discrete timed automaton A 
and two sets I and P of configurations of A (tuples of control state and clock 
values) definable by Presburger formulas, A always reaches a configuration in P 
when starting from a configuration in I. 

In this paper we concentrate on the Presburger liveness problem, by systematically 
formulating a number of Presburger liveness properties and investigating 
their decidability. For instance, we consider the ∃-Presburger-i.o. problem: 
whether there exists an ω-path p for A such that p starts from I and P is satisfied 
on p infinitely often. Another example is the ∀-Presburger-eventual problem: 
whether for all ω-paths p that start from I, P is eventually satisfied on p. 

The main results of this paper show that (using an obvious notation, once it 
is clear that ∃ and ∀ are path quantifiers): 

- The ∃-Presburger-i.o. problem and the ∃-Presburger-eventual problem are 
both decidable. So are their duals, the ∀-Presburger-almost-always problem 
and the ∀-Presburger-always problem. 

- The ∀-Presburger-i.o. problem and the ∀-Presburger-eventual problem are 
both undecidable. So are their duals, the ∃-Presburger-almost-always problem 
and the ∃-Presburger-always problem. 

These results can be helpful in formulating a weak form of a Presburger linear 
temporal logic and in defining a fragment thereof that is decidable for model- 
checking dta. The proofs are based on the definition of a version of dta, called 
static dta, which does not have enabling conditions on transitions. The decidabil- 
ity of the previous Presburger liveness problems is the same for dta and static 
dta. Hence, proofs can be easier, since static dta are much simpler to deal with 
than dta. 

The paper is organized as follows. Section 2 introduces the main definitions, 
such as discrete timed automata and the Presburger liveness properties. Section 
3 shows the decidability of the ∃-Presburger-i.o. and of the ∃-Presburger-eventual 
problems, by introducing static dta. Section 4 shows the undecidability of the 
∀-Presburger-i.o. and of the ∀-Presburger-eventual problems. Section 5 discusses 
some aspects related to the introduction of Presburger conditions in temporal 
logic, and to the extension of our results to dense time domains. 

The proofs of some lemmas and theorems can be found in the full version of 
the paper available at http://www.eecs.wsu.edu/~zdang. 

2 Preliminaries 

A timed automaton [3] is a finite state machine augmented with a number of
real-valued clocks. All the clocks progress synchronously with rate 1, except when
a clock is reset to 0 at some transition. In this paper, we consider integer-valued
clocks. A clock constraint (or a region) is a Boolean combination of atomic clock
constraints of the following form: $x \# c$, $x - y \# c$, where $\#$ denotes
$<$, $>$, $\le$, $\ge$, or $=$, $c$ is an integer, and $x, y$ are integer-valued
clocks. Let $\mathcal{L}_X$ be the set of all clock constraints on clocks $X$. Let
$N$ be the set of nonnegative integers.

Definition 1. A discrete timed automaton (dta) is a tuple $A = (S, X, E)$ where
$S$ is a finite set of (control) states, $X$ is a finite set of clocks with values in $N$,
and $E \subseteq S \times 2^X \times \mathcal{L}_X \times S$ is a finite set of edges or transitions.

Each edge $(s, \lambda, l, s')$ denotes a transition from state $s$ to state $s'$ with enabling
condition $l \in \mathcal{L}_X$ and a set of clock resets $\lambda \subseteq X$. Note that $\lambda$ may be
empty: in this case, the edge is called a clock progress transition. Since each pair
of states may have more than one edge between them, in general $A$ is nondeterministic.

The semantics of dtas is defined as follows. We use $A, B, V, W, X, Y$ to
denote clock vectors (i.e., vectors of clock values) with $V_x$ being the value of
clock $x$ in $V$. $\mathbf{I}$ denotes the identity vector in $N^{|X|}$, i.e., $\mathbf{I}_x = 1$ for each $x \in X$.

Definition 2. (Configuration, One-Step Transition Relation $\to^A$) A configuration
$(s, V) \in S \times N^{|X|}$ is a tuple of a control state $s$ and a clock vector $V$.
$(s, V) \to^A (s', V')$ denotes a one-step transition from configuration $(s, V)$ to
configuration $(s', V')$ satisfying all the following conditions:

— There is an edge $(s, \lambda, l, s')$ in $A$ connecting state $s$ to state $s'$,

— The enabling condition of the edge is satisfied, that is, $l(V)$ is true,

— Each clock changes according to the edge. If there are no clock resets on the
edge, i.e., $\lambda = \emptyset$, then clocks progress by one time unit, i.e., $V' = V + \mathbf{I}$. If
$\lambda \ne \emptyset$, then for each $x \in \lambda$, $V'_x = 0$, while for each $x \notin \lambda$, $V'_x = V_x$.

A configuration $(s, V)$ is a deadlock configuration if there is no configuration
$(s', V')$ such that $(s, V) \to^A (s', V')$. $A$ is total if no configuration is a
deadlock configuration. A path is a finite sequence $(s_0, V^0) \cdots (s_k, V^k)$ such that
$(s_i, V^i) \to^A (s_{i+1}, V^{i+1})$ for each $0 \le i \le k - 1$. A path is a progress path if
there is at least one clock progress transition on the path. An ω-path is an infinite
sequence $(s_0, V^0) \cdots (s_k, V^k) \cdots$ such that each prefix $(s_0, V^0) \cdots (s_k, V^k)$ is a
path. An ω-path is divergent if there is an infinite number of clock progress
transitions on the ω-path. Without loss of generality, in this paper we consider
timed automata without event labels [3], since they can be built into the control
states.
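The one-step semantics of Definition 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `successors`, `is_deadlock`, and `EDGES`, the encoding of enabling conditions as Python predicates, and the two-state example automaton are all hypothetical and do not come from the paper.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

Clocks = Dict[str, int]                        # clock name -> value in N
Guard = Callable[[Clocks], bool]               # enabling condition l(V)
Edge = Tuple[str, FrozenSet[str], Guard, str]  # (s, resets, l, s')
Config = Tuple[str, Clocks]


def successors(edges: List[Edge], state: str, clocks: Clocks) -> List[Config]:
    """Configurations reachable from (state, clocks) in one step."""
    result: List[Config] = []
    for src, resets, guard, dst in edges:
        if src != state or not guard(clocks):
            continue  # wrong source state, or enabling condition false
        if not resets:
            # Clock progress transition: every clock advances by one unit.
            nxt = {x: v + 1 for x, v in clocks.items()}
        else:
            # Clocks in `resets` become 0; all other clocks keep their value.
            nxt = {x: (0 if x in resets else v) for x, v in clocks.items()}
        result.append((dst, nxt))
    return result


def is_deadlock(edges: List[Edge], state: str, clocks: Clocks) -> bool:
    """A configuration is a deadlock if it has no one-step successor."""
    return not successors(edges, state, clocks)


# Hypothetical automaton with states "a", "b" and clocks x, y.
EDGES: List[Edge] = [
    ("a", frozenset(), lambda v: v["x"] < 2, "a"),
    ("a", frozenset({"x"}), lambda v: v["x"] - v["y"] == 0, "b"),
]
```

In this toy automaton state "b" has no outgoing edge, so every configuration at "b" is a deadlock and the automaton is not total.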
Let $Y$ be a finite set of variables over integers. For all integers $a_y$ with $y \in Y$,
$b$ and $c$ (with $c > 0$), $\sum_{y \in Y} a_y y < b$ is an atomic linear relation on $Y$ and
$\sum_{y \in Y} a_y y \equiv_c b$ is a linear congruence on $Y$. A linear relation on $Y$ is a Boolean
combination (using $\neg$ and $\wedge$) of atomic linear relations on $Y$. A Presburger
formula on $Y$ is a Boolean combination of atomic linear relations on $Y$ and
of linear congruences on $Y$. A set $P$ is Presburger-definable if there exists a
Presburger formula $\mathcal{F}$ on $Y$ such that $P$ is exactly the set of the solutions for $Y$
that make $\mathcal{F}$ true. Since Presburger formulas are closed under quantification,
we will allow quantifiers over integer variables.
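As a small illustration of the two kinds of atoms, the following Python sketch evaluates an atomic linear relation and a linear congruence on a given valuation. The function names and the example set are hypothetical and chosen only for this sketch.

```python
from typing import Dict

Valuation = Dict[str, int]  # variable name -> integer value


def linear_sum(coeffs: Dict[str, int], vals: Valuation) -> int:
    """Compute sum_y a_y * y for the coefficients a_y in `coeffs`."""
    return sum(a * vals[y] for y, a in coeffs.items())


def atomic_relation(coeffs: Dict[str, int], b: int, vals: Valuation) -> bool:
    """Atomic linear relation: sum_y a_y * y < b."""
    return linear_sum(coeffs, vals) < b


def congruence(coeffs: Dict[str, int], b: int, c: int, vals: Valuation) -> bool:
    """Linear congruence: sum_y a_y * y is congruent to b modulo c (c > 0)."""
    if c <= 0:
        raise ValueError("modulus c must be positive")
    return (linear_sum(coeffs, vals) - b) % c == 0


def in_example_set(vals: Valuation) -> bool:
    # Hypothetical Presburger-definable set: x - y < 3 and x + y is even,
    # written with negation and conjunction only.
    return atomic_relation({"x": 1, "y": -1}, 3, vals) and not congruence(
        {"x": 1, "y": 1}, 1, 2, vals
    )
```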

Write $(s, V) \leadsto^A (s', V')$ if $(s, V)$ reaches $(s', V')$ through a path in $A$. The
binary relation $\leadsto^A$ can be considered as a subset of configuration tuples and is
called binary reachability. It has been shown recently that:

Theorem 1. The binary reachability $\leadsto^A$ is Presburger-definable [9,10].

The Presburger safety analysis problem is to consider whether A can only reach 
configurations in P starting from any configuration in I, given two Presburger- 
definable sets I and P of configurations. Because of Theorem 1, the Presburger 
safety analysis problem is decidable [10] for dtas. 

In this paper, we consider Presburger liveness analysis problems for dtas,
obtained by combining a path-quantifier with various modalities of satisfaction
on an ω-path. Let $I$ and $P$ be two Presburger-definable sets of configurations,
and let $p$ be an ω-path $(s_0, V^0), (s_1, V^1), \ldots$. Define the following modalities of
satisfaction of $P$ and $I$ over $p$:

— $p$ is P-i.o. if $P$ is satisfied infinitely often on the ω-path, i.e., there are
infinitely many $k$ such that $(s_k, V^k) \in P$.

— $p$ is P-always if for each $k$, $(s_k, V^k) \in P$.

— $p$ is P-eventual if there exists $k$ such that $(s_k, V^k) \in P$.

— $p$ is P-almost-always if there exists $k$ such that for all $k' > k$, $(s_{k'}, V^{k'}) \in P$.

— $p$ starts from $I$ if $(s_0, V^0) \in I$.
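The modalities above can be checked mechanically on an ultimately periodic sequence of configurations, written as a finite prefix followed by a cycle that repeats forever. The sketch below assumes this lasso shape, which is a restriction made only for illustration: a divergent ω-path of a dta has growing clock values and is not of this form. All names are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Config = Tuple[str, Tuple[int, ...]]  # (control state, clock vector)
Pred = Callable[[Config], bool]       # membership test for a set P or I


def is_io(P: Pred, prefix: Sequence[Config], cycle: Sequence[Config]) -> bool:
    """P holds infinitely often iff it holds somewhere on the repeated cycle."""
    return any(P(c) for c in cycle)


def is_always(P: Pred, prefix: Sequence[Config], cycle: Sequence[Config]) -> bool:
    """P holds at every position of the prefix and of the cycle."""
    return all(P(c) for c in prefix) and all(P(c) for c in cycle)


def is_eventual(P: Pred, prefix: Sequence[Config], cycle: Sequence[Config]) -> bool:
    """P holds at some position of the prefix or of the cycle."""
    return any(P(c) for c in prefix) or any(P(c) for c in cycle)


def is_almost_always(P: Pred, prefix: Sequence[Config], cycle: Sequence[Config]) -> bool:
    """P holds from some point on iff it holds on the whole cycle."""
    return all(P(c) for c in cycle)


def starts_from(I: Pred, prefix: Sequence[Config], cycle: Sequence[Config]) -> bool:
    """The first configuration of the sequence belongs to I."""
    return I((list(prefix) + list(cycle))[0])


# Hypothetical non-divergent sequence with one clock that is reset in each round.
PREFIX = [("a", (0,))]
CYCLE = [("a", (1,)), ("b", (0,))]
```

The cycle must be non-empty for the sequence to be infinite; the sketch does not check this.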

Definition 3. (Presburger Liveness Analysis Problems) Let A be a dta and let I
and P be two Presburger-definable sets of configurations of A. The ∃-Presburger-i.o.
(resp. always, eventual and almost-always) problem is to decide whether the
following statement holds: there is an ω-path p starting from I that is P-i.o.
(resp. P-always, P-eventual and P-almost-always). The ∀-Presburger-i.o. (resp.
always, eventual and almost-always) problem is to decide whether the following
statement holds: for every ω-path p, if p starts from I, then p is P-i.o. (resp.
P-always, P-eventual and P-almost-always).

3 Decidability Results 

In this section, we show that the ∃-Presburger-i.o. problem is decidable for dtas.
Proofs of an infinitely-often property usually involve analysis of cycles in the
transition system. However, for dtas, this is difficult for the following reasons. A
discrete timed automaton A can be treated as a transition graph on control states
with clock reset sets properly assigned to each edge, and augmented with tests
(i.e., clock constraints) on edges. The tests are dynamic: the results of the tests
depend upon the current values of each clock and determine which
edges may be taken. This is an obstacle to applying cyclic analysis techniques
on the transition graph of A.

A solution to these difficulties is to introduce static discrete timed automata,
i.e., dtas with all the enabling conditions being simply true. The lack of enabling
conditions simplifies the proof that the ∃-Presburger-i.o. problem is decidable
for static dtas. Then, we show that each ∃-Presburger-i.o. problem for a dta can
be translated into an ∃-Presburger-i.o. problem for a static dta, and hence it is
decidable as well.

3.1 The ∃-Presburger-i.o. Problem for Static dtas

Let A be a static dta. We show that the ∃-Presburger-i.o. problem for static dtas
is decidable. Given two sets I and P of configurations definable by Presburger
formulas, an ω-path $p = (s_0, V^0) \cdots (s_k, V^k) \cdots$ is a witness if it is a solution of
the ∃-Presburger-i.o. problem, i.e., p is P-i.o. and p starts from I ($(s_0, V^0) \in I$).
There are two cases for a witness p: (1) p is not divergent; (2) p is divergent. For
Case (1), we can establish the following lemma by expressing the existence of p
as a Presburger formula obtained from the binary reachability of A.

Lemma 1. The existence of a non-divergent witness is decidable. 

The difficult case, however, is when the witness p is divergent. The remainder
of this subsection is devoted to the proof that the existence of a divergent witness
is decidable. For now, we fix a choice of a control state $s$ and a set $X_r \subseteq X$ of
clocks (there are only finitely many such choices). To ensure that p is divergent, each
path from $(s_{k_i} = s, V^{k_i})$ to $(s_{k_{i+1}} = s, V^{k_{i+1}})$ is picked so that it contains at
least one clock progress transition, i.e., it is a progress cycle, as follows.

Definition 4. For all clock vectors $V, V'$, we write $(s, V) \leadsto_{X_r} (s, V')$ if

1. there exists $(s_0, V^0) \in I$ such that $(s_0, V^0) \leadsto^A (s, V)$, i.e., $(s, V)$ is
reachable from a configuration in I,

2. $(s, V) \in P$,

3. $(s, V) \leadsto^A (s, V')$ through a progress path on which all the clocks in $X_r$ are
reset at least once and all the clocks not in $X_r$ are never reset.

The proof proceeds as follows. First, we show (Lemma 2) that the relation
$\leadsto_{X_r}$ is Presburger-definable. Then, since A is finite state, there exists a P-i.o.
ω-path p iff there is a state s such that P holds infinitely often on p at state s.
This is equivalent to saying (Lemma 3) that there exist clock vectors $V^1, V^2, \ldots$
such that $(s, V^i) \leadsto_{X_r} (s, V^{i+1})$ for each $i > 0$. Since the actual values of the
clocks in $X_r$ may be abstracted away (Lemma 4 and Definition 5) and the clocks
in $X - X_r$ progress synchronously, this is equivalent to saying that there exist
$V, d^1 > 0, d^2 > 0, \ldots$ such that $V^i_x = V_x + d^i$ for all $x \in X - X_r$ (Lemma 5). The
set $\{d^i\}$ may be defined with a Presburger formula, as shown in Lemma 7, since
each $d^i$ may always be selected to be of the form $c^i + f(c^i)$, where the set $\{c^i\}$
is a periodic set (hence, Presburger-definable) and $f$ is a Presburger-definable
function. This is based on the fact that static automata have no edge conditions,
allowing us to increase the length $d$ of a progress cycle to a length $nd$ (Lemma 6),
for every $n > 0$. The decidability result on the existence of a divergent witness
follows directly from Lemma 7.

Lemma 2. $\leadsto_{X_r}$ is Presburger-definable. That is, given $s \in S$,
$(s, V) \leadsto_{X_r} (s, V')$ is a Presburger formula, when the clock vectors $V, V'$ are
regarded as integer variables.

Based upon the above analysis, the following lemma is immediate: 

Lemma 3. There is a divergent witness p iff there are $s$, $X_r$ and clock vectors
$V^1, V^2, \ldots$ such that $(s, V^i) \leadsto_{X_r} (s, V^{i+1})$ for each $i > 0$.

$(s, V) \leadsto_{X_r} (s, W)$ denotes the following scenario. Starting from some
configuration in I, A can reach $(s, V)$ and return to $s$ again with clock values $W$. The
cycle at $s$ is a progress one such that each clock in $X_r$ resets at least once and all
clocks not in $X_r$ do not reset. Since A is static, the cycle can be represented by
a sequence $s_0 s_1 \cdots s_t$ of control states, with $s_0 = s_t = s$, and such that, for each
$0 \le i < t$, there is an edge in A connecting $s_i$ and $s_{i+1}$. Observe that, since each
$x \in X_r$ is reset in the cycle, the ending clock values $W_x$ with $x \in X_r$ at $s_t = s$ do
not depend on the starting clock values $V_x$ for $x \in X_r$ at $s_0 = s$ (those values
of $W_x$ depend only on the sequence of control states). We write $V =_{X - X_r} U$
if $V$ and $U$ agree on the values of the clocks not in $X_r$, i.e., $V_x = U_x$ for each
$x \in X - X_r$. The insensitivity property is stated in the following lemma.

Lemma 4. For all clock vectors $U, V, W$, if $(s, V) \leadsto_{X_r} (s, W)$ and $(s, U) \in P$
is reachable from some configuration in I with $V =_{X - X_r} U$, then
$(s, U) \leadsto_{X_r} (s, W)$.

Also note that, since all clocks not in $X_r$ do not reset on the cycle, the
differences $W_x - V_x$ for each $x \in X - X_r$ are equal to the duration of the cycle
(i.e., the number of progress transitions in the cycle). The following technical
definition allows us to "abstract" clock values for $X_r$ away in $(s, V) \leadsto_{X_r} (s, W)$.

Definition 5. For all clock vectors $Y$ and for all positive integers $d$, we write
$Y \leadsto_{(s, X_r)} Y + d\mathbf{I}$ if there exist two clock vectors $V$ and $W$ such that
$(s, V) \leadsto_{X_r} (s, W)$ with $Y =_{X - X_r} V$ and $Y + d\mathbf{I} =_{X - X_r} W$.

In the previous definition, the cycle from $(s, V)$ to $(s, W)$ has
duration $d$. Also, the relation $\leadsto_{(s, X_r)}$ is Presburger-definable (over $Y$ and $d$).

Lemma 5. There exists a divergent witness for A if, and only if, there are $s, X_r$,
$Y, d^1, d^2, \ldots$ such that $0 < d^1 < d^2 < \ldots$ and
$Y + d^i\mathbf{I} \leadsto_{(s, X_r)} Y + d^{i+1}\mathbf{I}$ for each $i \ge 1$.

The following technical lemma, based on Definition 5 and Lemma 5, will be 
used in the proof of Lemma 7. 

Lemma 6. For all $Y, Y'$, and for all $n \ge 1$, $d > 0$, if $Y \leadsto_{(s, X_r)} Y + d\mathbf{I}$ and
$Y + nd\mathbf{I} \leadsto_{(s, X_r)} Y'$, then $Y + d\mathbf{I} \leadsto_{(s, X_r)} Y'$.

Lemma 7. It is decidable whether there exists a divergent witness for a static
dta A.

Proof. We claim that there are $s, X_r$ such that the Presburger formula

(*) $\exists Y\, \forall m > 0\, \exists d_1 \ge m\, \exists d_2 > 0\, ( Y + d_1\mathbf{I} \leadsto_{(s, X_r)} Y + (d_1 + d_2)\mathbf{I} )$

holds if and only if there is a divergent witness for A. The statement of the
lemma then follows immediately.

Assume there is a divergent witness. Hence, by Lemma 3, there exist $V^1, V^2, \ldots$
and $d^1, d^2, \ldots$ such that, for each $i \ge 1$, $(s, V^i) \leadsto_{X_r} (s, V^{i+1})$ with a progress
cycle of duration $d^i > 0$. Let $Y$ be such that $Y =_{X - X_r} V^1$. By Definition 5,
$Y + (\sum_{j=1}^{i-1} d^j)\mathbf{I} \leadsto_{(s, X_r)} Y + (\sum_{j=1}^{i} d^j)\mathbf{I}$ for each $i \ge 1$. For each $m > 0$, let
$d_1 = \sum_{j=1}^{m} d^j$, $d_2 = d^{m+1}$. It is immediate that (*) holds.

Conversely, let $Y_0$ be one of the vectors $Y$ such that (*) holds. Apply skolemization
to the formula $\exists d_2 > 0\, ( Y_0 + d_1\mathbf{I} \leadsto_{(s, X_r)} Y_0 + (d_1 + d_2)\mathbf{I} )$, by
introducing a function $f(d_1)$ to replace the variable $d_2$. Since (*) holds, the
formula $H(d_1)$, defined as $Y_0 + d_1\mathbf{I} \leadsto_{(s, X_r)} Y_0 + (d_1 + f(d_1))\mathbf{I}$, holds for
infinitely many values of $d_1$. Combining this with the fact that $H(d_1)$ is Presburger-definable
(because $Y_0$ is fixed), there is a periodic set included in the infinite domain of
$H$, i.e., there exist $n \ge 1$, $k \ge 0$ such that for all $d > 0$, if $d \equiv_n k$ then $H(d)$ holds.
Let $c^0$ be any value in the periodic set, and let $c^i = c^{i-1} + n f(c^{i-1})$, for every
$i \ge 1$. Every $c^i$ satisfies the periodic condition $c^i \equiv_n k$, and therefore
$H(c^i)$ holds. Hence, for every $i \ge 1$, $Y_0 + c^i\mathbf{I} \leadsto_{(s, X_r)} Y_0 + (c^i + f(c^i))\mathbf{I}$.
Since $Y_0 + c^{i+1}\mathbf{I} = Y_0 + c^i\mathbf{I} + n f(c^i)\mathbf{I} \leadsto_{(s, X_r)} Y_0 + (c^{i+1} + f(c^{i+1}))\mathbf{I}$, we
may apply Lemma 6, with: $Y = Y_0 + c^i\mathbf{I}$, $d = f(c^i)$, $Y' = Y_0 + (c^{i+1} + f(c^{i+1}))\mathbf{I}$.
Lemma 6 then gives $Y + d\mathbf{I} \leadsto_{(s, X_r)} Y'$, i.e.,
$Y_0 + (c^i + f(c^i))\mathbf{I} \leadsto_{(s, X_r)} Y_0 + (c^{i+1} + f(c^{i+1}))\mathbf{I}$, for every $i \ge 1$.
By Lemma 5, with $d^i = c^i + f(c^i)$, there is a
divergent witness. ∎
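The arithmetic core of the converse direction is that the sequence defined by c^i = c^(i-1) + n·f(c^(i-1)) never leaves the residue class of c^0 modulo n, and that the durations d^i = c^i + f(c^i) are strictly increasing when f is positive. The sketch below checks this numerically; the function `progression` and the sample Skolem function `f` are hypothetical stand-ins, not objects constructed in the paper.

```python
from typing import Callable, List


def progression(c0: int, n: int, f: Callable[[int], int], steps: int) -> List[int]:
    """Return [c^0, ..., c^steps] with c^i = c^(i-1) + n * f(c^(i-1))."""
    cs = [c0]
    for _ in range(steps):
        # Adding a multiple of n keeps the value in the same class modulo n.
        cs.append(cs[-1] + n * f(cs[-1]))
    return cs


# Hypothetical data: period n = 3, residue k = 2, and a positive function f.
n, k = 3, 2
f = lambda d: d % 4 + 1
cs = progression(2, n, f, 3)
durations = [c + f(c) for c in cs]  # the candidate values d^i = c^i + f(c^i)
```

Since d^i = c^i + f(c^i) is at most c^(i+1), which is smaller than d^(i+1), the durations form the strictly increasing sequence required by Lemma 5.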

By Lemmas 1 and 7, we have: 

Theorem 2. The ∃-Presburger-i.o. problem is decidable for static dtas.

3.2 The ∃-Presburger-i.o. Problem for dtas

In the full paper, we use a technique modified from [10] to show that the tests in
A can be eliminated. That is, A can be effectively transformed into A'' where all
the tests are simply true and A'' has (almost) the same static transition graph as
A. This is based on an encoding of the tests of A into the finite state control of
A''. Now we look at the ∃-Presburger-i.o. problem for A. Recall that the problem
is to determine, given two Presburger-definable sets I and P of configurations of
A, whether there exists a P-i.o. ω-path p starting from I. We relate the instance
of the ∃-Presburger-i.o. problem for A to an instance of the ∃-Presburger-i.o.
problem for A''.

Lemma 8. Given a dta A, and two Presburger-definable sets I and P of
configurations of A, there exist a static dta A'' and two Presburger-definable sets
I'' and P'' of configurations of A'' such that: the existence of a witness to the
∃-Presburger-i.o. problem for A, given I and P, is equivalent to the existence of a
witness to the ∃-Presburger-i.o. problem for A'', given I'' and P''.

Since A'' is a static dta, the decidability of the ∃-Presburger-i.o. problem for A
follows from Theorem 2 and Lemma 8.

Theorem 3. The ∃-Presburger-i.o. problem and the ∀-Presburger-almost-always
problem are decidable for dtas.

3.3 Decidability of the ∃-Presburger-Eventual Problem

Given a dta A, and two Presburger-definable sets I and P of configurations, the
∃-Presburger-eventual problem is to decide whether there exists a P-eventual
ω-path p starting from I. Define I' to be the set of all configurations in P that can
be reached from a configuration in I. From Theorem 1, I' is Presburger-definable.
Let P' be simply true. It can be shown that the existence of a witness for the
∃-Presburger-eventual problem (given I and P) is equivalent to the existence of
a witness for the ∃-Presburger-i.o. problem (given I' and P'). From Theorem 3,
we obtain:

Theorem 4. The ∃-Presburger-eventual problem and the ∀-Presburger-always
problem are decidable for dtas.

It should be noted that there is a slight difference between the ∀-Presburger-always
problem and the Presburger safety analysis problem mentioned before.
The difference is that the Presburger safety analysis problem considers (finite)
paths while the ∀-Presburger-always problem considers ω-paths.

4 Undecidability Results 

The next three subsections show the undecidability of the ∀-Presburger-eventual
problem and of the ∀-Presburger-i.o. problem. We start by demonstrating
that a two-counter machine can be implemented by a generalized
version of a dta. This fact is then used in the following two subsections to show
the undecidability results.

4.1 Counter Machines and Generalized Discrete Timed Automata 

Consider a counter machine M with counters $x_1, \ldots, x_k$ over nonnegative
integers and with a finite set of locations $\{l_1, \ldots, l_n\}$. M can increment, decrement
and test against 0 the values of the counters. It is well-known that a two-counter
machine can simulate a Turing machine.

We now define generalized discrete timed automata. They are defined similarly
to dtas, but for each edge $(s, \lambda, l, s')$ the formula $l$ is of the form $\sum_i a_i x_i \# c$,
where the $a_i$ and $c$ are integers. Generalized dtas are Turing-complete, since they
can simulate any counter machine:

Lemma 9. Given a deterministic counter machine M , there exists a determin- 
istic generalized dta that can simulate M. 

From now on, let M be a deterministic counter machine and let A be a
deterministic generalized dta that implements M. We may assume that A is total
(i.e., there are no deadlock configurations), since A can be made total by adding a
new self-looped state $s_f$, and directing every deadlock configuration to this new
state. Now we define the static version $A^-$ of A as follows.
$A^-$ is a discrete timed automaton with the enabling condition on each edge being
simply true. Each state in $A^-$ is a pair of states in A. $((s_1, s_1'), \lambda_1, true, (s_2, s_2'))$
is an edge of $A^-$ iff there are edges $(s_1, \lambda_1, l_1, s_1')$ and $(s_2, \lambda_2, l_2, s_2')$ in A with
$s_1' = s_2$. We define a set P of configurations of $A^-$, called the path restriction of A,
as follows. For each configuration $((s, s'), V)$ of $A^-$, $((s, s'), V) \in P$ iff
there exists an edge $e = (s, \lambda, l, s')$ of A such that the clock values $V$ satisfy the
linear relation $l$ in $e$. Clearly, P is Presburger-definable. Since A is total and
deterministic, the above edge $e$ always exists and is unique for each configuration
$(s, V)$ of A. Using this fact, we have:

Theorem 5. Let A be a total and deterministic generalized dta with path
restriction P, and let $A^-$ be the static version of A. An ω-sequence
$(s_0, V^0) \cdots (s_k, V^k) \cdots$ is an ω-path of A iff
$((s_0, s_1), V^0) \cdots ((s_k, s_{k+1}), V^k) \cdots$ is an ω-path
of $A^-$ with $((s_k, s_{k+1}), V^k) \in P$ for each $k$.
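The construction of the static version and of the path restriction can be sketched as follows. The edge encoding (guards as Python predicates over a clock dictionary) and the names `static_version`, `in_path_restriction`, and `EDGES` are assumptions made for this illustration, and the three-edge automaton is hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

Clocks = Dict[str, int]
Edge = Tuple[str, FrozenSet[str], Callable[[Clocks], bool], str]  # (s, resets, l, s')
Pair = Tuple[str, str]
StaticEdge = Tuple[Pair, FrozenSet[str], Pair]  # guard omitted: it is always true


def static_version(edges: List[Edge]) -> List[StaticEdge]:
    """Edges of the static automaton: ((s1, s1'), resets1, (s2, s2')) with s1' = s2."""
    result = []
    for s1, resets1, _guard1, t1 in edges:
        for s2, _resets2, _guard2, t2 in edges:
            if t1 == s2:
                result.append(((s1, t1), resets1, (s2, t2)))
    return list(dict.fromkeys(result))  # drop duplicates, keep order


def in_path_restriction(edges: List[Edge], pair: Pair, clocks: Clocks) -> bool:
    """((s, s'), V) is in P iff some edge from s to s' has its guard satisfied by V."""
    s, t = pair
    return any(src == s and dst == t and guard(clocks)
               for src, _resets, guard, dst in edges)


# Hypothetical automaton over one clock x.
EDGES: List[Edge] = [
    ("a", frozenset(), lambda v: v["x"] < 2, "a"),
    ("a", frozenset({"x"}), lambda v: v["x"] >= 2, "b"),
    ("b", frozenset(), lambda v: True, "a"),
]
STATIC = static_version(EDGES)
```

The static automaton may take the pair state ("a", "b") at any clock value; only the configurations of that state with x at least 2 belong to the path restriction, which is how the removed guards are recovered.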

4.2 Undecidability of the ∀-Presburger-Eventual Problem

We consider the negation of the ∀-Presburger-eventual problem, i.e., the
∃-Presburger-always problem, which can be formulated as follows: given a discrete
timed automaton A and two Presburger-definable sets I and P of configurations,
decide whether there exists a ¬P-always ω-path of A starting from I.

Consider a deterministic counter machine M with the initial values of the
counters being 0 and the first instruction labeled $l_0$. Let A be the deterministic
generalized dta implementing M, as defined by Lemma 9, with P being the
path restriction of A. As before, A is total. Let $A^-$ be the static version of A.
It is well known that the halting problem for (deterministic) counter machines
is undecidable. That is, it is undecidable, given M and an instruction label $l$,
whether M executes the instruction $l$. Define P' to be the set of configurations
$((s, s'), V) \in P$ with $s \ne l$. Let I be the set of initial configurations of $A^-$ with
all the clocks being 0 and the first component of the state (note that each state
in $A^-$ is a state pair of A) being $l_0$. I is finite, thus Presburger-definable. From
Theorem 5 and the fact that A implements M, we have: M does not halt at $l$ iff
$A^-$ has a P'-always ω-path starting from a configuration in I. Thus, we reduce
the negation of the halting problem to the ∃-Presburger-always problem for dtas
with configuration sets P' and I. Therefore,

Theorem 6. The ∃-Presburger-always problem and the ∀-Presburger-eventual
problem are undecidable for discrete timed automata.

4.3 Undecidability of the ∀-Presburger-i.o. Problem

In this subsection, we show that the ∃-Presburger-almost-always problem is
undecidable. Therefore, the ∀-Presburger-i.o. problem is also undecidable. In the
previous subsection, we have shown that the existence of a P-always ω-path of
A is undecidable. But this result does not directly imply that the existence of a
P-almost-always ω-path is also undecidable.

In fact, let $A^-$ be the static version of a generalized discrete timed automaton
A that implements a deterministic counter machine M, let P be the path
restriction of A, and let p be an ω-path of $A^-$. In the previous subsection, we
argued that the existence of a P'-always ω-path p is undecidable where P' is
$P \cap \{((s, s'), V) : s \ne l\}$ with $l$ being a given instruction label in M. But when
considering a P'-almost-always path p, the situation is different: p may have
a prefix that does not necessarily satisfy P' (i.e., it does not obey the exact
enabling conditions on the edges in A).

Consider a deterministic two-counter machine M with an input tape, and
denote by $M(i)$ the result of the computation of M when given $i \in N$ as input.
It is known that the finiteness problem for deterministic two-counter machines
(i.e., whether there are only finitely many $i$ such that $M(i)$ halts) is undecidable.
Now we reduce the finiteness problem to the ∃-Presburger-almost-always problem
for dtas.

We can always assume that M halts when and only when it executes an
operation labeled halt. Let M' be a counter machine (without input tape) that
enumerates all the computations of M on every $i \in N$. M' works as follows.
We use $M_j(i)$ to denote the j-th step of the computation of $M(i)$. If $M(i)$
halts in fewer than j steps, then we assume that $M_j(i)$ is a special null operation
that does nothing. Thus, the entire computation of $M(i)$ is an ω-sequence
$M_1(i), \ldots, M_j(i), \ldots$ (when $M(i)$ halts, the sequence is composed of a finite
prefix, the halt operation and then infinitely many occurrences of the special null
operation). Each step of the computation may or may not execute the instruction
labeled halt, but of course a halt may be executed at most once for
each input value i. M' implements the following program:
    k := 0; z := 0;
    while true do
        k := k + 1;
        for i := 0 to k - 1 do
            z := 1;
            simulate M(i) for the first k steps M_1(i), M_2(i), ..., M_k(i);
            if M_k(i) executes the instruction labeled halt, then z := 0;

M' is still a deterministic counter machine (with various additional counters to
be able to simulate M and keep track of k, i, z). In the enumeration, whenever
$M_k(i)$ executes the instruction labeled halt (at most once for each i, by the
definition of M' as above), M' sets the counter z to 0, bringing it back to 1
immediately afterwards. M' resets z to 0 only finitely many times iff the
domain of M (i.e., the set of i such that $M(i)$ halts) is finite. Let $A^-$ be the
static version of a generalized discrete timed automaton A that implements M'.
Let P be the path restriction of A, and let $P' = P \cap \{((s, s'), V) : V_z \ne 0\}$. It can be
established, by using Lemma 9 and Theorem 5, that there are only finitely many
i such that $M(i)$ halts iff $A^-$ has a P'-almost-always ω-path starting from I, where
I contains only the initial configuration. Therefore,

Theorem 7. The ∃-Presburger-almost-always problem and the ∀-Presburger-i.o.
problem are undecidable for discrete timed automata.
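The enumeration performed by M' in Section 4.3 can be mimicked on a finite number of rounds. In the sketch below the simulation of M is replaced by a hypothetical oracle `halting_step(i)` that returns the step at which M(i) executes halt, or None if it never does; this oracle, the function name `enumerate_halts`, and the sample table are assumptions for illustration only. The sample table only contains halting steps larger than the input index, so every halt falls inside the enumerated range of pairs (k, i) with i < k.

```python
from typing import Callable, List, Optional, Tuple


def enumerate_halts(
    halting_step: Callable[[int], Optional[int]], rounds: int
) -> List[Tuple[int, int]]:
    """Run the loop of M' for `rounds` values of k; return the (k, i) where z is set to 0."""
    zero_events = []
    z = 0
    for k in range(1, rounds + 1):
        for i in range(k):
            z = 1
            # M_k(i) executes halt exactly when M(i) halts at step k.
            if halting_step(i) == k:
                z = 0
                zero_events.append((k, i))
    return zero_events


# Hypothetical machine: inputs 0, 2, 4 halt at steps 1, 3, 5; other inputs diverge.
TABLE = {0: 1, 2: 3, 4: 5}
EVENTS = enumerate_halts(TABLE.get, 10)
```

With a finite domain, the list of events stops growing after finitely many rounds, which matches the observation that z is set to 0 only finitely many times.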

5 Discussions and Future Work 

It is important to provide a uniform framework to clarify what kind of temporal 
Presburger properties can be automatically checked for timed automata. Given 
a dta A, the set of linear temporal logic formulas $\mathcal{L}_A$ with respect to A is defined
by the following grammar: $\phi ::= P \mid \neg\phi \mid \phi \wedge \phi \mid \bigcirc\phi \mid \phi\, U\, \phi$, where $P$ is a
Presburger-definable set of configurations of A, $\bigcirc$ denotes "next", and $U$ denotes "until".
Formulas in $\mathcal{L}_A$ are interpreted on ω-sequences p of configurations of A in the
usual way. We use $p^i$ to denote the ω-sequence resulting from the deletion of the
first i configurations from p. We use $p_i$ to indicate the i-th element in p. The
satisfiability relation $\models$ is recursively defined as follows, for each ω-sequence p
and for each formula $\phi \in \mathcal{L}_A$ (written $p \models \phi$):

$p \models P$ if $p_1 \in P$,
$p \models \neg\phi$ if not $p \models \phi$,
$p \models \phi_1 \wedge \phi_2$ if $p \models \phi_1$ and $p \models \phi_2$,
$p \models \bigcirc\phi$ if $p^1 \models \phi$,
$p \models \phi_1\, U\, \phi_2$ if $\exists j\, (p^j \models \phi_2$ and $\forall k < j\, (p^k \models \phi_1))$,

where the variables $j, k$ range over $N$. We adopt the convention that $\Diamond\phi$
(eventual) abbreviates $(true\, U\, \phi)$ and $\Box\phi$ (always) abbreviates $(\neg\Diamond\neg\phi)$.
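The semantics above can be evaluated mechanically on an ultimately periodic sequence, represented as a finite list of configurations whose last position loops back to an earlier one. This lasso representation, the tuple encoding of formulas, and every name in the sketch are assumptions made for illustration; positions are 0-indexed here, whereas the text numbers the first element $p_1$.

```python
from typing import List, Set

TRUE = ("atom", lambda config: True)


def eventually(phi):
    """Eventual operator, defined as (true U phi)."""
    return ("until", TRUE, phi)


def always(phi):
    """Always operator, defined as not eventually not phi."""
    return ("not", eventually(("not", phi)))


def sat(formula, word: List, loop_start: int) -> Set[int]:
    """Positions of the lasso word whose suffix satisfies `formula`."""
    n = len(word)

    def succ(i: int) -> int:
        # The last position loops back to `loop_start`.
        return i + 1 if i + 1 < n else loop_start

    op = formula[0]
    if op == "atom":
        return {i for i in range(n) if formula[1](word[i])}
    if op == "not":
        return set(range(n)) - sat(formula[1], word, loop_start)
    if op == "and":
        return sat(formula[1], word, loop_start) & sat(formula[2], word, loop_start)
    if op == "next":
        inner = sat(formula[1], word, loop_start)
        return {i for i in range(n) if succ(i) in inner}
    if op == "until":
        left = sat(formula[1], word, loop_start)
        result = set(sat(formula[2], word, loop_start))
        changed = True
        while changed:  # least fixpoint: add positions that reach `result` via `left`
            changed = False
            for i in left - result:
                if succ(i) in result:
                    result.add(i)
                    changed = True
        return result
    raise ValueError(f"unknown operator: {op}")


# Hypothetical sequence 0, 1, 2, 3, 2, 3, ... (prefix [0, 1], cycle [2, 3]).
WORD, LOOP = [0, 1, 2, 3], 2
P = ("atom", lambda c: c == 3)
Q = ("atom", lambda c: c == 1)
```

On this sequence the formula "always eventually P" holds at position 0 because 3 lies on the cycle, while "always eventually Q" fails because 1 occurs only in the prefix.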

Given A and a formula $\phi \in \mathcal{L}_A$, the model-checking problem is to check
whether each ω-path p of A satisfies $p \models \phi$. The satisfiability-checking problem,
which is the dual of the model-checking problem, is to check whether there is an
ω-path p of A satisfying $p \models \phi$. The results of this paper show that:

— The satisfiability-checking problem is decidable for formulas in $\mathcal{L}_A$ in the
form $I \wedge \Box\Diamond P$ and $I \wedge \Diamond P$, where I and P are Presburger.

— The model-checking problem is undecidable for formulas in $\mathcal{L}_A$, even when
the formulas are in the form $\Box\Diamond P$ and $\Diamond P$.

— Hence, both the satisfiability-checking problem and the model-checking problem
are undecidable for the entire $\mathcal{L}_A$, even when the "next" operator $\bigcirc$ is
excluded from the logic $\mathcal{L}_A$.

Future work may include investigating a fragment of $\mathcal{L}_A$ that has a decidable
satisfiability-checking/model-checking problem. For instance, we do not know
whether the satisfiability-checking problem is decidable for $I \wedge \Box\Diamond P \wedge \Box\Diamond Q$ (i.e.,
find an ω-path that is both P-i.o. and Q-i.o.). A decidable subset of $\mathcal{L}_A$ may be
worked out along the recent work of Comon and Cortier [8] on model-checking a
decidable subset of a Presburger (in the discrete case) LTL for one-cycle counter
machines.

In [6], an extension of TPTL, called Presburger TPTL, is proposed and it is
shown to be undecidable for discrete time. The proof in [6] does not imply (at
least, not in an obvious way) the undecidability of the ∀-Presburger-i.o. problem
and the ∀-Presburger-eventual problem in this paper. In that proof, the semantics
of Presburger TPTL (over discrete time domain) is interpreted on timed state
sequences. The transition relation of a two-counter machine can be encoded
into Presburger TPTL by using $\bigcirc$, $U$ and the freeze quantifier. This gives the
undecidability of the logic [6]. On the other hand, $\Box\Diamond P$ and $\Diamond P$ in this paper are
interpreted on sequences of configurations (in contrast to timed state sequences).
Formulas like $\Box\Diamond P$ and $\Diamond P$ are state formulas. That is, without using $\bigcirc$ and
without introducing freeze quantifiers, we have no way to remember clock values
in one configuration and use them to compare those in another configuration
along p. Therefore, the transition relation of a two-counter machine cannot be
encoded in our logic $\mathcal{L}_A$. But we are able to show in this paper that computations
of a two-counter machine can be encoded by ω-paths, restricted under $\Box\Diamond P$ or
$\Diamond P$, of a dta, leading to the undecidability results of this paper.

We are also interested in considering the same set of liveness problems for a
dense time domain. We believe that the decidability results (for the ∃-Presburger-i.o.
problem and the ∃-Presburger-eventual problem) also hold for dense time
when the semantics of a timed automaton is carefully defined. A possible approach
is to look at Comon and Jurski's flattening construction [9]. The undecidability
results in this paper can be naturally extended to the dense time domain
when the ω-paths in this paper are properly redefined for dense time.

Thanks to the anonymous reviewers for a number of useful suggestions. 
Abstract. We introduce a subclass of non deterministic finite automata (NFA)
that we call Residual Finite State Automata (RFSA): a RFSA is a NFA all the 
states of which define residual languages of the language it recognizes. We prove 
that for every regular language L, there exists a unique RFSA that recognizes 
L and which has both a minimal number of states and a maximal number of 
transitions. Moreover, this canonical RFSA may be exponentially smaller than 
the equivalent minimal DFA but it also may have the same number of states as 
the equivalent minimal DFA, even if minimal equivalent NFA are exponentially 
smaller. We provide an algorithm that computes the canonical RFSA equivalent 
to a given NFA. We study the complexity of several decision and construction 
problems linked to the class of RFSA: most of them are PSPACE-complete. 



1 Introduction 

Regular languages and finite automata have been extensively studied since the
beginning of formal language theory. Representation of regular languages by means of
Deterministic Finite Automata (DFA) has many nice properties: there exists a unique minimal
DFA that recognizes a given regular language (minimal in number of states and unique
up to an isomorphism); each state q of a DFA A defines a language (composed of the
words which lead to a final state from q) which is a natural component of the language
L recognized by A, namely a residual language of L. One of the major drawbacks of
DFA is that they provide representations of regular languages whose size is far from
optimal. For example, the regular language $\Sigma^*0\Sigma^n$ is represented here by a regular
expression whose size is $O(\log n)$ while its minimal DFA has about $2^n$ states. Using
Non deterministic Finite Automata (NFA) rather than DFA can drastically improve the
size of the representation: the minimal NFA which recognizes $\Sigma^*0\Sigma^n$ has $n + 2$ states.
However, NFA have neither of the two above-mentioned properties: languages associated
with states have no natural interpretation and two minimal NFA need not be isomorphic.

In this paper, we study a subclass of non deterministic finite automata that we call
Residual Finite State Automata (RFSA). By definition, a RFSA is a NFA all the states
of which define residual languages of the language it recognizes. More precisely, a NFA
$A = (\Sigma, Q, Q_0, F, \delta)$ is a RFSA if for every state q in Q there exists a word u such
that, for every word v, uv is recognized by A if and only if a final state can be reached
from q by reading v. Clearly, all DFA are RFSA but the converse is false.

We prove that among all the RFSA which recognize a given regular language, there
exists a unique element which has both a minimal number of states and a maximal
number of transitions. This canonical RFSA may be exponentially smaller than the
equivalent minimal DFA (for example, the canonical RFSA which recognizes $\Sigma^*0\Sigma^n$
has $n + 2$ states); but it may also have the same number of states as the equivalent
minimal DFA, even if minimal equivalent NFA are exponentially smaller. Other approaches
to canonical NFA can be found in [Car70] and [ADN92].

It is well known that for a given DFA $A$ recognizing a language $L$, if we first
construct the mirror automaton $\overline{A}$ and then the deterministic automaton equivalent to $\overline{A}$
using the standard subset construction technique, we obtain the minimal DFA for $\overline{L}$.
We prove a similar property for RFSA. This property provides an algorithm which
computes the canonical RFSA equivalent to a given NFA. Unfortunately, we also prove
that this construction problem is PSPACE-complete, as are most of the constructions we
define in this paper.

In section 2, we recall classical definitions and notations about regular languages 
and automata. We define RFSA in section 3 and we study their properties in section 4. 
In particular, we introduce the notion of canonical RFSA. We provide a construction 
of the canonical RFSA from a given NFA in section 5. In section 6, we study some 
particular (and pathological) RFSA. Section 7 is devoted to the study of the complexity 
of our constructions. Finally, we conclude by indicating where this work originates from 
and by describing some of its applications in the field of grammatical inference. 

2 Preliminaries 

In this section, we recall some definitions concerning finite automata. For more infor- 
mation, we invite the reader to consult [HU79,Yu97]. 

2.1 Automata and Languages 



Fig. 1. The automaton $A_1$ recognizes $\Sigma^*0\Sigma$ but is neither a DFA nor a RFSA.

Fig. 2. $A_2$ is the minimal DFA recognizing $\Sigma^*0\Sigma$.

Fig. 3. $A_3$ is a RFSA recognizing $\Sigma^*0\Sigma$.

Let $\Sigma$ be a finite alphabet, and let $\Sigma^*$ be the set of words on $\Sigma$. We note $\varepsilon$ the empty
string and $|u|$ the length of a word $u$ in $\Sigma^*$. A language is a subset of $\Sigma^*$.

A non deterministic finite automaton (NFA) is a quintuple $A = (\Sigma, Q, Q_0, F, \delta)$
where $Q$ is a finite set of states, $Q_0 \subseteq Q$ is the set of initial states, and $F \subseteq Q$ is the set
of terminal states. $\delta$ is the transition function of the automaton, defined from a subset of
$Q \times \Sigma$ to $2^Q$. We also note $\delta$ the extended transition function, defined from a subset of
$2^Q \times \Sigma^*$ to $2^Q$ by:

$\delta(\{q\}, \varepsilon) = \{q\}$,

$\delta(\{q\}, x) = \delta(q, x)$,

$\delta(Q', u) = \bigcup \{\delta(\{q\}, u) \mid q \in Q'\}$ and

$\delta(\{q\}, ux) = \delta(\delta(q, u), x)$

where $Q' \subseteq Q$, $x \in \Sigma$, $q \in Q$ and $u \in \Sigma^*$.

A NFA is deterministic (DFA) if $Q_0$ contains exactly one element $q_0$ and if $\forall q \in Q$,
$\forall x \in \Sigma$, $\mathrm{Card}(\delta(q, x)) \leq 1$. A NFA is trimmed if $\forall q \in Q$, $\exists w_1 \in \Sigma^*$, $q \in \delta(Q_0, w_1)$
and $\exists w_2 \in \Sigma^*$, $\delta(q, w_2) \cap F \neq \emptyset$. A state $q$ is reachable by the word $u$ if $q \in \delta(Q_0, u)$.

A word $u \in \Sigma^*$ is recognized by a NFA if $\delta(Q_0, u) \cap F \neq \emptyset$, and the language
$L_A$ recognized by $A$ is the set of words recognized by $A$. We denote by $Rec(\Sigma^*)$ the
class of recognizable languages. It can be proved that every recognizable language can
be recognized by a DFA. There exists a unique minimal DFA that recognizes a given
recognizable language (minimal with regard to the number of states and unique up to
an isomorphism). Finally, the Kleene theorem [Kle56] proves that the class of regular
languages $Reg(\Sigma^*)$ is identical to $Rec(\Sigma^*)$.
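The definitions above can be sketched in a few lines of Python. This is only an illustrative encoding, not part of the paper: an NFA is assumed to be a tuple `(sigma, states, init, final, delta)` where `delta` maps `(state, symbol)` to a set of states, and the names `step`, `run` and `accepts` are our own. The example automaton is a three-state NFA for $\Sigma^*0\Sigma$ with hypothetical state labels 1, 2, 3.

```python
def step(delta, states, x):
    """Image of a set of states under one symbol."""
    return frozenset(r for q in states for r in delta.get((q, x), ()))

def run(delta, states, word):
    """Extended transition function delta(Q', u) for a set Q' and a word u."""
    current = frozenset(states)
    for x in word:
        current = step(delta, current, x)
    return current

def accepts(nfa, word):
    """A word u is recognized iff delta(Q0, u) meets F."""
    _sigma, _states, init, final, delta = nfa
    return bool(run(delta, init, word) & final)

# Assumed encoding of a three-state NFA for Sigma* 0 Sigma
# (state names 1, 2, 3 are illustrative labels).
A1 = ("01", {1, 2, 3}, frozenset({1}), frozenset({3}),
      {(1, "0"): {1, 2}, (1, "1"): {1}, (2, "0"): {3}, (2, "1"): {3}})
```

With this encoding, `accepts(A1, "1101")` holds because the second-to-last letter is 0, while `accepts(A1, "10")` does not.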

The mirror of a word $u = x_1 \ldots x_n$ ($x_i \in \Sigma$) is defined by $\overline{u} = x_n \ldots x_1$.

The mirror of a language $L$ is $\overline{L} = \{\overline{u} \mid u \in L\}$. The mirror of an automaton
$A = (\Sigma, Q, Q_0, F, \delta)$ is $\overline{A} = (\Sigma, Q, F, Q_0, \overline{\delta})$ where $q \in \overline{\delta}(q', x)$ if and only if $q' \in \delta(q, x)$.
It is clear that $L_{\overline{A}} = \overline{L_A}$.

Let $L$ be a regular language. Let $A = (\Sigma, Q, Q_0, F, \delta)$ be a NFA that recognizes $L$
and let $Q' \subseteq Q$. We note $L_{Q'}$ the language defined by $L_{Q'} = \{v \mid \delta(Q', v) \cap F \neq \emptyset\}$.
When $Q'$ contains exactly one state $q$, we simply denote $L_{Q'}$ by $L_q$.

2.2 Residual Languages 

Let $L$ be a language over $\Sigma^*$ and let $u \in \Sigma^*$. The residual language of $L$ with
regard to $u$ is defined by $u^{-1}L = \{v \in \Sigma^* \mid uv \in L\}$. If $L$ is recognized by a NFA
$(\Sigma, Q, Q_0, F, \delta)$, then $q \in \delta(Q_0, u) \Rightarrow L_q \subseteq u^{-1}L$.

The Myhill-Nerode theorem [Myh57,Ner58] proves that the set of distinct residual
languages of any regular language is finite. Furthermore, if $A = (\Sigma, Q, Q_0, F, \delta)$ is the
minimal DFA recognizing $L$, we have:

- for every non empty residual language $u^{-1}L$, there exists a unique $q \in Q$ such that
$L_q = u^{-1}L$,

- $\forall q \in Q$, there exists a unique residual language $u^{-1}L$ such that $u^{-1}L = L_q$.



3 Definition of Residual Finite State Automaton 

Definition 1. A Residual Finite State Automaton (RFSA) is a NFA $A = (\Sigma, Q, Q_0, F, \delta)$
such that, for each state $q \in Q$, $L_q$ is a residual language of $L_A$. More formally,
$\forall q \in Q$, $\exists u \in \Sigma^*$ such that $L_q = u^{-1}L_A$.

Remark: Trimmed DFA have this property, and therefore are RFSA. 
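Definition 1 can be checked mechanically on small automata. The sketch below is an illustration under our own assumptions, not the paper's algorithm: it uses the fact that $u^{-1}L_A = L_{\delta(Q_0, u)}$, enumerates the reachable subsets, and compares languages of sets of states by exploring pairs of subsets. The tuple encoding `(sigma, states, init, final, delta)`, the function names, and the state labels of the two example automata for $\Sigma^*0\Sigma$ are hypothetical. The procedure is exponential in the worst case.

```python
def step(delta, states, x):
    return frozenset(r for q in states for r in delta.get((q, x), ()))

def reachable_subsets(nfa):
    """All sets delta(Q0, u); each one represents the residual u^{-1} L_A."""
    sigma, _states, init, _final, delta = nfa
    start = frozenset(init)
    seen, todo = {start}, [start]
    while todo:
        p = todo.pop()
        for x in sigma:
            r = step(delta, p, x)
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return seen

def included(nfa, s, t):
    """Decide L_s subset-of L_t for two sets of states s and t."""
    sigma, _states, _init, final, delta = nfa
    start = (frozenset(s), frozenset(t))
    seen, todo = {start}, [start]
    while todo:
        a, b = todo.pop()
        if a & final and not b & final:
            return False  # some word is accepted from s but not from t
        for x in sigma:
            nxt = (step(delta, a, x), step(delta, b, x))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True

def same_language(nfa, s, t):
    return included(nfa, s, t) and included(nfa, t, s)

def is_rfsa(nfa):
    """Every state language must equal some residual of the recognized language."""
    residuals = reachable_subsets(nfa)
    return all(any(same_language(nfa, {q}, p) for p in residuals)
               for q in nfa[1])

# Two assumed encodings of automata for Sigma* 0 Sigma (labels are illustrative).
A1 = ("01", {1, 2, 3}, frozenset({1}), frozenset({3}),
      {(1, "0"): {1, 2}, (1, "1"): {1}, (2, "0"): {3}, (2, "1"): {3}})
A3 = ("01", {1, 2, 3}, frozenset({1}), frozenset({3}),
      {(1, "0"): {1, 2}, (1, "1"): {1}, (2, "0"): {1, 2, 3},
       (2, "1"): {1, 3}, (3, "0"): {1, 2}, (3, "1"): {1}})
```

Under these encodings `is_rfsa(A1)` is false (state 2 defines $\Sigma$, which is not a residual), while `is_rfsa(A3)` is true.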

Definition 2. Let $A = (\Sigma, Q, Q_0, F, \delta)$ be a RFSA and let $q$ be a state of $A$. We say
that $u$ is a characterizing word for $q$ if $L_q = u^{-1}L_A$.

Example 1. We study here the regular language $L = \Sigma^*0\Sigma$ where $\Sigma = \{0, 1\}$. One
can prove that this language is recognized by the following automata $A_1$, $A_2$ and $A_3$
(fig. 1, 2, 3):

- $A_1$ is a NFA recognizing $L$. One can notice that $A_1$ is neither a DFA nor a RFSA.
Languages associated with states are $L_{q_1} = \Sigma^*0\Sigma$, $L_{q_2} = \Sigma$, $L_{q_3} = \{\varepsilon\}$. As for
every $u$ in $\Sigma^*$ we have $uL \subseteq L$ and so $L \subseteq u^{-1}L$, we can see that neither $L_{q_2}$ nor
$L_{q_3}$ is a residual language.

- $A_2$ is the minimal DFA that recognizes $L$. This automaton is also a RFSA: we have
$L_{q_1} = \Sigma^*0\Sigma$, $L_{q_2} = \Sigma^*0\Sigma + \Sigma$, $L_{q_3} = \Sigma^*0\Sigma + \Sigma + \varepsilon$, $L_{q_4} = \Sigma^*0\Sigma + \varepsilon$, so
$L_{q_1} = \varepsilon^{-1}L$, $L_{q_2} = 0^{-1}L$, $L_{q_3} = (00)^{-1}L$, $L_{q_4} = (01)^{-1}L$.

- $A_3$ is a RFSA recognizing $L$. Indeed, we have $L_{q_1} = \varepsilon^{-1}L$, $L_{q_2} = 0^{-1}L$,
$L_{q_3} = (01)^{-1}L$. One can notice that this automaton is not a DFA. This automaton is the
canonical RFSA of $L$, which is one of the smallest RFSA (regarding the number of
states) recognizing $L$ (the notion of canonical RFSA will be described later).

Example 2. Looking for a characterizing word for a state $q$ is often equivalent to looking
for a word $u_q$ that leads only to $q$ (i.e. such that $\delta(Q_0, u_q) = \{q\}$). Nevertheless, such a
word does not always exist. For example, let $L = a^*b^* + b^*a^*$.

Fig. 4. A RFSA recognizing the language $a^*b^* + b^*a^*$.

The automaton described in figure 4 recognizes $L$. We have $L_{q_1} = b^*a^*$, $L_{q_2} = a^*$,
$L_{q_3} = a^*b^*$, $L_{q_4} = b^*$. This automaton is a RFSA, as $L_{q_1} = b^{-1}L$, $L_{q_2} = (ba)^{-1}L$,
$L_{q_3} = a^{-1}L$, $L_{q_4} = (ab)^{-1}L$. But there exists no word $u$ such that $\delta(Q_0, u) = \{q_3\}$.

4 Properties of Residual Finite State Automata 

4.1 General Properties 

Definition 3. Let $L$ be a regular language. We say that a residual language $u^{-1}L$ is
prime if it is not equal to the union of residual languages it strictly contains:

$u^{-1}L$ is prime if $\bigcup \{v^{-1}L \mid v^{-1}L \subsetneq u^{-1}L\} \subsetneq u^{-1}L$.

We say that a residual language is composed if it is not prime.

Notice that a prime residual language is not empty and that the set of distinct prime 
residual languages of a regular language is finite. 
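As a small worked illustration (our own, using the language $L = \Sigma^*0\Sigma$ with $\Sigma = \{0, 1\}$ from Example 1), the four distinct residuals can be listed and checked against Definition 3:

```latex
\begin{align*}
\varepsilon^{-1}L &= L, &
0^{-1}L &= L \cup \Sigma, \\
(01)^{-1}L &= L \cup \{\varepsilon\}, &
(00)^{-1}L &= L \cup \Sigma \cup \{\varepsilon\} = 0^{-1}L \cup (01)^{-1}L .
\end{align*}
```

The residual $(00)^{-1}L$ is the union of two residuals it strictly contains, so it is composed. Each of the other three strictly contains at most the residual $L$, which is not enough to cover it, so $L$, $L \cup \Sigma$ and $L \cup \{\varepsilon\}$ are prime.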

Proposition 1. Let $A = (\Sigma, Q, Q_0, F, \delta)$ be a RFSA. For each prime residual $u^{-1}L_A$,
there exists a state $q \in Q$ such that $L_q = u^{-1}L_A$.

Proof: Let $\delta(Q_0, u) = \{q_1, \ldots, q_s\}$ and let $v_1, \ldots, v_s$ be words such that
$L_{q_i} = v_i^{-1}L_A$ for every $1 \leq i \leq s$. We have

$u^{-1}L_A = \bigcup_{i=1}^{s} v_i^{-1}L_A$.

As $u^{-1}L_A$ is prime, there exists some $v_i$ such that $u^{-1}L_A = v_i^{-1}L_A = L_{q_i}$. □

As a corollary, a RFSA $A$ has at least as many states as the number of prime residuals
of $L_A$.

4.2 Saturation Operator 

We define a saturation operator that allows us to add transitions to an automaton without
modifying the language it recognizes.

Definition 4. Let $A = (\Sigma, Q, Q_0, F, \delta)$ be a NFA. We call saturated of $A$ the automaton
$S(A) = (\Sigma, Q, \tilde{Q}_0, F, \tilde{\delta})$ with $\tilde{Q}_0 = \{q \in Q \mid L_q \subseteq L_A\}$ and
$\tilde{\delta}(q, x) = \{q' \in Q \mid xL_{q'} \subseteq L_q\}$. We say that an automaton $A$ is saturated if $A = S(A)$.

Lemma 1. Let $A$ and $A'$ be two NFA sharing the same set of states $Q$. If $L_A = L_{A'}$
and if for every state $q \in Q$, $L_q = L'_q$ ($L_q$ and $L'_q$ being the languages corresponding
to $q$ in both automata), then $S(A) = S(A')$.

Proof: The state $q$ is an initial state of $S(A)$ if and only if $L_q \subseteq L_A$, that is if and only
if $q$ is an initial state of $S(A')$.

In the same way, $q' \in \tilde{\delta}(q, x)$ in $S(A)$ if and only if $xL_{q'} \subseteq L_q$, i.e. if and only if
$q' \in \tilde{\delta}'(q, x)$ in $S(A')$. □

We note $\tilde{L}_q = \{u \mid \tilde{\delta}(q, u) \cap F \neq \emptyset\}$.

Proposition 2. Let $A$ be a NFA and let $S(A)$ be its saturated. For each state $q$ of $A$, we
have $L_q = \tilde{L}_q$.

Proof: Clearly, $L_q \subseteq \tilde{L}_q$ as the saturated of an automaton is obtained by adding
transitions and initial states. To prove the converse inclusion, we prove by induction that for
every integer $n$ and every state $q$,

$\tilde{L}_q \cap \Sigma^{\leq n} \subseteq L_q$.

If $n = 0$, the property is true as $A$ and $S(A)$ have the same terminal states. Let
$u = xv \in \tilde{L}_q \cap \Sigma^{\leq n}$ with $n \geq 1$ and let $q' \in \tilde{\delta}(q, x)$ such that $v \in \tilde{L}_{q'}$. Because of
our induction hypothesis, $v \in L_{q'}$. As $q' \in \tilde{\delta}(q, x)$, we have $xL_{q'} \subseteq L_q$ and therefore
$xv \in L_q$. □

Corollary 1. Let $A$ be a NFA and $S(A)$ be its saturated. Then $A$ and $S(A)$ recognize
the same language and $S(A) = S(S(A))$.

Proof:

- We have $L_A = \bigcup \{L_q \mid q \in Q_0\} = \bigcup \{L_q \mid q \in \tilde{Q}_0\} = \bigcup \{\tilde{L}_q \mid q \in \tilde{Q}_0\}$, which is equal
to the language recognized by $S(A)$.

- Due to the previous point and to proposition 2, lemma 1 can be applied to $A$
and $S(A)$ to prove that $S(S(A)) = S(A)$; the saturated of a saturated automaton is
itself. □

Corollary 2. If $A$ is a RFSA then $S(A)$ is also a RFSA.

Proof: The saturated of a RFSA is a RFSA as the saturation changes neither the
languages associated with the states nor the language recognized by the automaton. □
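The saturation operator can be sketched as follows. This is an illustrative implementation under our own assumptions (tuple encoding `(sigma, states, init, final, delta)`, helper names `step`, `included`, `saturate`), not the paper's code. It relies on the observation that $xL_{q'} \subseteq L_q$ holds exactly when $L_{q'} \subseteq x^{-1}L_q = L_{\delta(q, x)}$, and it decides language inclusion by exploring pairs of subsets, which is exponential in the worst case. The example automaton is a hypothetical encoding of the four-state minimal DFA for $\Sigma^*0\Sigma$.

```python
def step(delta, states, x):
    return frozenset(r for q in states for r in delta.get((q, x), ()))

def included(nfa, s, t):
    """Decide L_s subset-of L_t for two sets of states s and t."""
    sigma, _states, _init, final, delta = nfa
    start = (frozenset(s), frozenset(t))
    seen, todo = {start}, [start]
    while todo:
        a, b = todo.pop()
        if a & final and not b & final:
            return False
        for x in sigma:
            nxt = (step(delta, a, x), step(delta, b, x))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True

def saturate(nfa):
    """Build S(A): add every initial state and transition allowed by Definition 4."""
    sigma, states, init, final, delta = nfa
    new_init = frozenset(q for q in states if included(nfa, {q}, init))
    new_delta = {}
    for q in states:
        for x in sigma:
            # x L_{q'} within L_q  iff  L_{q'} within L_{delta(q, x)}
            image = step(delta, {q}, x)
            new_delta[(q, x)] = {r for r in states if included(nfa, {r}, image)}
    return (sigma, states, new_init, final, new_delta)

# Assumed encoding of the minimal DFA for Sigma* 0 Sigma (labels are illustrative).
A2 = ("01", {1, 2, 3, 4}, frozenset({1}), frozenset({3, 4}),
      {(1, "0"): {2}, (1, "1"): {1}, (2, "0"): {3}, (2, "1"): {4},
       (3, "0"): {3}, (3, "1"): {4}, (4, "0"): {2}, (4, "1"): {1}})
```

On this example, saturation adds for instance the transition from state 1 to itself on 0, and applying `saturate` twice gives the same automaton as applying it once, in line with Corollary 1.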

4.3 Reduction Operator $\phi$

We define a reduction operator $\phi$ that deletes states in an automaton without changing
the language it recognizes.

Definition 5. Let $A = (\Sigma, Q, Q_0, F, \delta)$ be a NFA, and let $q$ be a state of $Q$. We note
$R(q) = \{q' \in Q \setminus \{q\} \mid L_{q'} \subseteq L_q\}$. We say that $q$ is erasable in $A$ if
$L_q = \bigcup \{L_{q'} \mid q' \in R(q)\}$.

If $q$ is erasable, we define $\phi(A, q) = A' = (\Sigma, Q', Q'_0, F', \delta')$ where:

- $Q' = Q \setminus \{q\}$,

- $Q'_0 = Q_0$ if $q \notin Q_0$, and $Q'_0 = (Q_0 \setminus \{q\}) \cup R(q)$ otherwise,

- $F' = F \cap Q'$,

- for every $q' \in Q'$ and every $x \in \Sigma$,
$\delta'(q', x) = \delta(q', x)$ if $q \notin \delta(q', x)$, and
$\delta'(q', x) = (\delta(q', x) \setminus \{q\}) \cup R(q)$ otherwise.

If $q$ is not erasable, we define $\phi(A, q) = A$.
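A possible sketch of the reduction operator is given below. It is illustrative only and rests on assumptions of ours: the tuple encoding `(sigma, states, init, final, delta)`, the helper names, and in particular the reading that transitions (and initial status) pointing to an erased state $q$ are redirected to the states of $R(q)$. The example is a hypothetical encoding of the minimal DFA for $\Sigma^*0\Sigma$, in which the state for $(00)^{-1}L$ is erasable.

```python
def step(delta, states, x):
    return frozenset(r for q in states for r in delta.get((q, x), ()))

def accepts(nfa, word):
    _sigma, _states, init, final, delta = nfa
    current = frozenset(init)
    for x in word:
        current = step(delta, current, x)
    return bool(current & final)

def included(nfa, s, t):
    """Decide L_s subset-of L_t for two sets of states s and t."""
    sigma, _states, _init, final, delta = nfa
    start = (frozenset(s), frozenset(t))
    seen, todo = {start}, [start]
    while todo:
        a, b = todo.pop()
        if a & final and not b & final:
            return False
        for x in sigma:
            nxt = (step(delta, a, x), step(delta, b, x))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True

def R(nfa, q):
    """States other than q whose language is included in L_q."""
    return {r for r in nfa[1] if r != q and included(nfa, {r}, {q})}

def erasable(nfa, q):
    # The union of the L_r, r in R(q), is always inside L_q; test the converse.
    return included(nfa, {q}, R(nfa, q))

def reduce_state(nfa, q):
    """phi(A, q): erase q if it is erasable, otherwise return A unchanged."""
    if not erasable(nfa, q):
        return nfa
    sigma, states, init, final, delta = nfa
    rq = R(nfa, q)

    def patch(targets):
        targets = set(targets)
        return (targets - {q}) | rq if q in targets else targets

    new_delta = {(s, x): patch(t) for (s, x), t in delta.items() if s != q}
    return (sigma, states - {q}, frozenset(patch(init)),
            frozenset(final - {q}), new_delta)

# Assumed encoding of the minimal DFA for Sigma* 0 Sigma (labels are illustrative).
A2 = ("01", {1, 2, 3, 4}, frozenset({1}), frozenset({3, 4}),
      {(1, "0"): {2}, (1, "1"): {1}, (2, "0"): {3}, (2, "1"): {4},
       (3, "0"): {3}, (3, "1"): {4}, (4, "0"): {2}, (4, "1"): {1}})
```

Here state 3 is erasable because its language is the union of the languages of states 2 and 4, and `reduce_state(A2, 3)` yields a three-state NFA recognizing the same language.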

Let $q' \in Q$ be a state different from $q$. We note $L_{q'}$ the language generated from $q'$
in the automaton $A$ and $L'_{q'}$ the language generated from $q'$ in $A' = \phi(A, q)$.

Proposition 3. Let $A$ be a NFA and let $q$ be a state of $A$. The automata $A$ and
$A' = \phi(A, q)$ recognize the same language and for every state $q' \neq q$, $L_{q'} = L'_{q'}$.

Sketch of proof:

If $q$ is not an erasable state, the proposition is straightforward. If $q$ is an erasable
state, we first prove that $L_{q'} = L'_{q'}$ using the fact that every path that reads
a word $u$ in $A$ through $q$ corresponds to a path in $A'$ that uses an added transition, and
vice versa.

Finally, we prove that $L_A = \bigcup_{q_0 \in Q_0} L_{q_0} = \bigcup_{q_0 \in Q'_0} L'_{q_0} = L_{A'}$. □



Proposition 4. The operator $\phi$ is an internal operator for the class of RFSA.

Proof: Neither the language recognized by a RFSA $A$ nor the languages associated
with its states are modified by the reduction operator $\phi$ (cf. previous proposition). So,
languages associated with states remain residual languages of $L_A$. □

We prove now that saturation and reduction operators can be swapped.

Lemma 2. Let $A = (\Sigma, Q, Q_0, F, \delta)$ be a NFA and let $q$ be a state of $Q$. Then the
automaton $\phi(S(A), q)$ is saturated.

Proof: We note $L'_{q'}$ (resp. $L_{q'}$) the language associated with a state $q'$ in $\phi(S(A), q)$
(resp. in $S(A)$), $\delta'$ (resp. $\delta$) the transition function of $\phi(S(A), q)$ (resp. of $S(A)$) and $L$
the language recognized by the automata $A$, $S(A)$ and $\phi(S(A), q)$.

- If $L'_{q'} \subseteq L$ then $L_{q'} \subseteq L$ and so $q'$ is initial in $S(A)$ and in $\phi(S(A), q)$.

- If $xL'_{q'} \subseteq L'_{q''}$ then $xL_{q'} \subseteq L_{q''}$ and so $q' \in \delta(q'', x)$ and $q' \in \delta'(q'', x)$. □



Proposition 5. Let $A = (\Sigma, Q, Q_0, F, \delta)$ be a NFA and let $q$ be a state of $Q$. We have

$S(\phi(A, q)) = \phi(S(A), q)$.

Proof: $\phi(A, q)$ and $\phi(S(A), q)$ have the same set of states. Furthermore, the languages
associated with every state $q'$ in $\phi(A, q)$ and $\phi(S(A), q)$ are identical because of the previous
results. Because of lemma 1, $S(\phi(A, q)) = S(\phi(S(A), q))$. As $\phi(S(A), q)$ is a
saturated automaton (cf. lemma 2), the proposition is proved. □

Definition 6. Let A be a NFA. If there is no erasable state in A, we say that A is re- 
duced. 

4.4 Canonical RFSA 

Definition 7. Let $L$ be a regular language. We define $A = (\Sigma, Q, Q_0, F, \delta)$, the
canonical RFSA of $L$, in the following way:

- $\Sigma$ is the alphabet of $L$,

- $Q$ is the set of prime residuals of $L$, so $Q = \{u^{-1}L \mid u^{-1}L \text{ is prime}\}$,

- its initial states are prime residuals included in $L$, so $Q_0 = \{u^{-1}L \in Q \mid u^{-1}L \subseteq L\}$,

- its final states are prime residuals containing the empty word, so
$F = \{u^{-1}L \in Q \mid \varepsilon \in u^{-1}L\}$,

- its transition function is $\delta(u^{-1}L, x) = \{v^{-1}L \in Q \mid v^{-1}L \subseteq (ux)^{-1}L\}$.

This definition assumes that the canonical RFSA is a RFSA; we prove this
below.

We have proved that the reduction operator $\phi$ transforms a RFSA into a RFSA,
and that it can be swapped with the saturation operator. We prove now that, if $A$ is
a saturated RFSA, the reduction operator converges and the resulting automaton is
the canonical RFSA of the language recognized by $A$.

Proposition 6. Let $L$ be a regular language and let $A = (\Sigma, Q, Q_0, F, \delta)$ be a reduced
and saturated RFSA recognizing $L$. Then $A$ is the canonical RFSA of $L$.

Proof: As $A$ is a RFSA, every prime residual $u^{-1}L$ of $L$ can be defined as a language
$L_q$ associated with some state $q \in Q$. As there are no erasable states in $A$, for every
state $q$, $L_q$ is a prime residual and distinct states define distinct languages. As $A$ is
saturated, prime residuals contained in $L$ correspond to initial states of $Q_0$. For the
same reason, we can verify that the transition function is the same as in the canonical
RFSA. □

Theorem 1. The canonical RFSA of a regular language L is a RFSA which recognizes 
L and which is minimal regarding the number of states. 

Proof: Let $A_0, \ldots, A_n$ be a sequence of NFA such that for every index $i \geq 1$, there
exists a state $q_i$ of $A_{i-1}$ such that $A_i = \phi(A_{i-1}, q_i)$. Propositions 5 and 6 prove that if
$A_0$ is a saturated RFSA and if $A_n$ is reduced, then $A_n$ is the canonical RFSA of the
language recognized by $A_0$.

So the canonical RFSA can be obtained from any RFSA that recognizes L using 
saturation and reduction operators. Proposition 1 proves that it has a minimal number 
of states. □ 

Note that it is possible to find a RFSA that has as many states as the canonical
RFSA of $L$, but fewer transitions. We have the following theorem:

Theorem 2. The canonical RFSA of a regular language $L$ is the unique RFSA that has
a maximal number of transitions among the set of RFSA which have a minimal number
of states.

Proof: Let $A = (\Sigma, Q, Q_0, F, \delta)$ be the canonical RFSA of a language $L$ and let
$A' = (\Sigma, Q', Q'_0, F', \delta')$ be a RFSA which has a minimal number of states. So, $A'$ is
reduced. From proposition 6, the saturated automaton of $A'$ is $A$. Therefore, $A'$ has at
most as many transitions as $A$. □

5 Construction of the Canonical RFSA Using the Subset Method 

In the previous section, we provided a way to build the canonical RFSA from a given
NFA using saturation and reduction operators. This method requires checking whether a
language is included in another one and whether a language is composed or
not. Those checks can be very expensive, even for simple automata. We present in this
section another method which stems from a classical construction of the minimal DFA
of a language and which is easier to implement.

Let $A = (\Sigma, Q, Q_0, F, \delta)$ be a NFA. The subset construction is a classical method
used to build a DFA equivalent to a given NFA. It consists in building the set of
reachable sets of states of $A$. We note $Q_{R(A)} = \{p \in 2^Q \mid \exists u \in \Sigma^* \text{ s.t. } \delta(Q_0, u) = p\}$ and
we define the subset automaton $D(A) = (\Sigma, Q_D, Q_{D0}, F_D, \delta_D)$ with

$Q_D = Q_{R(A)}$,

$Q_{D0} = \{Q_0\}$,

$F_D = \{p \in Q_D \mid p \cap F \neq \emptyset\}$,

$\delta_D(p, x) = \delta(p, x)$.

The automaton $D(A)$ is a deterministic automaton that recognizes the same
language as $A$.

We recall that $\overline{L}$ (resp. $\overline{B}$) denotes the mirror of a language $L$ (resp. of an
automaton $B$). The following result provides a method to build the minimal DFA of $L$.

Theorem 3. [Brz62] Let $L$ be a regular language and let $B$ be an automaton such that
$\overline{B}$ is a DFA that recognizes $\overline{L}$. Then $D(B)$ is the minimal DFA recognizing $L$.

We can deduce from this theorem that $D(\overline{D(\overline{A})})$ is the minimal DFA recognizing
the language $L_A$.
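The double-mirror construction can be sketched as follows. This is an illustrative implementation under our own assumptions: the tuple encoding `(sigma, states, init, final, delta)`, the function names, and the choice to drop the empty subset so that the resulting DFA is partial (which matches the definition $\mathrm{Card}(\delta(q, x)) \leq 1$ used in this paper). The helper `sigma_star_0_sigma_n` builds a hypothetical $(n + 2)$-state NFA for $\Sigma^*0\Sigma^n$.

```python
def step(delta, states, x):
    return frozenset(r for q in states for r in delta.get((q, x), ()))

def mirror(nfa):
    """Swap initial and final states and reverse every transition."""
    sigma, states, init, final, delta = nfa
    rev = {}
    for (q, x), targets in delta.items():
        for r in targets:
            rev.setdefault((r, x), set()).add(q)
    return (sigma, states, frozenset(final), frozenset(init), rev)

def determinize(nfa):
    """Subset construction D(A), restricted to non-empty reachable subsets."""
    sigma, _states, init, final, delta = nfa
    start = frozenset(init)
    seen, todo, trans = {start}, [start], {}
    while todo:
        p = todo.pop()
        for x in sigma:
            r = step(delta, p, x)
            if not r:
                continue  # leave the transition undefined instead of adding a sink
            trans[(p, x)] = {r}
            if r not in seen:
                seen.add(r)
                todo.append(r)
    accepting = frozenset(p for p in seen if p & final)
    return (sigma, seen, frozenset({start}), accepting, trans)

def minimal_dfa(nfa):
    """D(mirror(D(mirror(A)))), following the deduction from Theorem 3."""
    return determinize(mirror(determinize(mirror(nfa))))

def sigma_star_0_sigma_n(n):
    """Assumed (n + 2)-state NFA for Sigma* 0 Sigma^n over {0, 1}."""
    delta = {(0, "0"): {0, 1}, (0, "1"): {0}}
    for i in range(1, n + 1):
        delta[(i, "0")] = {i + 1}
        delta[(i, "1")] = {i + 1}
    return ("01", set(range(n + 2)), frozenset({0}), frozenset({n + 1}), delta)
```

For example, `minimal_dfa(sigma_star_0_sigma_n(2))` has 8 states, in line with the $2^{n+1}$ count given for this family in Section 6.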

We adapt the subset construction technique to deal with inclusions of sets of states.
We say that a state $p \in Q_{R(A)}$ is coverable if there exist states $p_i \in Q_{R(A)}$, $p_i \neq p$,
such that $p = \bigcup_i p_i$. We define the automaton $C(A) = (\Sigma, Q_C, Q_{C0}, F_C, \delta_C)$ by

$Q_C = \{p \in Q_{R(A)} \mid p \text{ is not coverable}\}$,

$Q_{C0} = \{p \in Q_C \mid p \subseteq Q_0\}$,

$F_C = \{p \in Q_C \mid p \cap F \neq \emptyset\}$,

$\delta_C(p, x) = \{p' \in Q_C \mid p' \subseteq \delta(p, x)\}$.
Lemma 3. Let $A$ be a NFA. Then $C(A)$ is a RFSA recognizing $L_A$ such that all states are
reachable.

Sketch of proof: $C(A)$ can be obtained from $D(A)$ by using techniques which are
similar to the ones used by the reduction operator. □

Theorem 4. Let $L$ be a regular language and let $B$ be an automaton such that $\overline{B}$ is
a RFSA recognizing $\overline{L}$ such that all states are reachable. Then $C(B)$ is the canonical
RFSA recognizing $L$.

Sketch of proof:

Let $q_i \in Q_B$, let $L_{q_i}$ be the language associated with $q_i$ in $\overline{B}$ and let $v_i \in \Sigma^*$ be
such that $L_{q_i} = v_i^{-1}\overline{L}$. Let $p, p' \in Q_{R(B)}$. We prove that:

- $\overline{v_i} \in L_p$ iff $q_i \in p$.

- $L_p \subseteq L_{p'}$ iff $p \subseteq p'$.

- For all states $p, p_1, p_2, \ldots, p_n \in Q_{R(B)}$, $L_p = \bigcup_{1 \leq k \leq n} L_{p_k}$ iff $p = \bigcup_{1 \leq k \leq n} p_k$.

From the last three statements, we can prove that $C(B)$ can be obtained from $D(B)$
by reduction and saturation. As $D(B)$ is deterministic, and using proposition 6, $C(B)$
is the canonical RFSA of $L$. □

We can deduce from this theorem and from lemma 3 that $C(\overline{C(\overline{A})})$ is the
canonical RFSA of $L_A$.
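The construction $C$ and its double-mirror use can be sketched as below. This is an illustration under assumptions of ours (tuple encoding `(sigma, states, init, final, delta)`, the function names, and the test family), not a reference implementation. A reachable subset is treated as coverable when it equals the union of the reachable subsets strictly contained in it, so the empty subset is always discarded.

```python
def step(delta, states, x):
    return frozenset(r for q in states for r in delta.get((q, x), ()))

def accepts(nfa, word):
    _sigma, _states, init, final, delta = nfa
    current = frozenset(init)
    for x in word:
        current = step(delta, current, x)
    return bool(current & final)

def mirror(nfa):
    sigma, states, init, final, delta = nfa
    rev = {}
    for (q, x), targets in delta.items():
        for r in targets:
            rev.setdefault((r, x), set()).add(q)
    return (sigma, states, frozenset(final), frozenset(init), rev)

def cover_construction(nfa):
    """C(A): keep the non-coverable reachable subsets, link them by inclusion."""
    sigma, _states, init, final, delta = nfa
    start = frozenset(init)
    reach, todo = {start}, [start]
    while todo:
        p = todo.pop()
        for x in sigma:
            r = step(delta, p, x)
            if r not in reach:
                reach.add(r)
                todo.append(r)

    def coverable(p):
        smaller = [r for r in reach if r < p]  # strictly contained subsets
        return frozenset().union(*smaller) == p

    keep = {p for p in reach if not coverable(p)}
    new_delta = {}
    for p in keep:
        for x in sigma:
            image = step(delta, p, x)
            new_delta[(p, x)] = {r for r in keep if r <= image}
    return (sigma, keep,
            frozenset(p for p in keep if p <= start),
            frozenset(p for p in keep if p & final),
            new_delta)

def canonical_rfsa(nfa):
    """C(mirror(C(mirror(A)))), following Lemma 3 and Theorem 4."""
    return cover_construction(mirror(cover_construction(mirror(nfa))))

def sigma_star_0_sigma_n(n):
    """Assumed (n + 2)-state NFA for Sigma* 0 Sigma^n over {0, 1}."""
    delta = {(0, "0"): {0, 1}, (0, "1"): {0}}
    for i in range(1, n + 1):
        delta[(i, "0")] = {i + 1}
        delta[(i, "1")] = {i + 1}
    return ("01", set(range(n + 2)), frozenset({0}), frozenset({n + 1}), delta)
```

On the family $\Sigma^*0\Sigma^n$, `canonical_rfsa` returns an automaton with $n + 2$ states that accepts the same words as the input NFA.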

However, this construction also has some weaknesses. Indeed, it is possible to find
examples for which $C(\overline{A})$ has an exponential number of states with regard to the
number of states of $A$ or $C(\overline{C(\overline{A})})$. We can observe this situation with the mirror of the
automaton used in proposition 8.

We can also observe that, if we are interested only in covering without saturation (if
a state is covered, we delete it and redirect its transitions to covering states), we get a
RFSA which has the same number of states (non-coverable states) and fewer transitions.

6 Results on Size of RFSA 

We classically take the number of states of an automaton as a measure of its size. The 
canonical RFSA of a regular language has the size of the equivalent minimal DFA as an 
upper bound and the size of one of its equivalent minimal NFA as a lower bound. We 
show that both bounds can be reached even if there exists an exponential gap between 
these two bounds. 

Proposition 7. There exist languages for which the minimal DFA has a size
exponentially larger than the size of the canonical RFSA, and for which the canonical RFSA
has the same size as a minimal NFA.

Proof: The languages $\Sigma^*0\Sigma^n$, where $n$ is an integer and $\Sigma = \{0, 1\}$, illustrate this
proposition.

Residuals of $L = \Sigma^*0\Sigma^n$ are the languages $L \cup (\bigcup_{p \in P} \Sigma^p)$ where $P \subseteq \{0, \ldots, n\}$.
One can observe that there exist $2^{n+1}$ distinct residuals. The minimal DFA recognizing
this language has $2^{n+1}$ states. There exist only $n + 2$ prime residuals: $L$, $L \cup \Sigma^0$, ...,
$L \cup \Sigma^n$; so, the canonical RFSA of $L$ has $n + 2$ states. □
Proposition 8. There exist languages for which the size of the canonical RFSA is
exponential with regard to the size of a minimal NFA.

Proof: Let $A_n = (\Sigma, Q, Q_0, F, \delta)$ be automata such that, for $n > 1$:

- $\Sigma = \{a, b\}$,

- $Q = \{q_i \mid 0 \leq i \leq n - 1\}$,

- $\delta$ is defined by

$\delta(q_i, a) = q_{i+1}$ (for $0 \leq i < n - 1$),

$\delta(q_{n-1}, a) = q_0$,

$\delta(q_0, b) = q_0$,

$\delta(q_i, b) = q_{i-1}$ (for $1 < i < n$) and
$\delta(q_1, b) = q_{n-1}$,

- $Q_0 = \{q_i \mid 0 \leq i < n/2\}$,

- $F = \{q_0\}$.

Figure 5 represents $A_4$.



Fig. 5. The automaton $A_n$ for $n = 4$, for which the equivalent RFSA is exponentially
larger.



The mirror automata $\overline{A_n}$ are trimmed and deterministic, thus we can apply
theorem 4. The automata $C(A_n)$ are canonical RFSA.

The initial state of the subset construction has $n/2$ elements. Moreover the reachable
states are all the subsets of $Q$ with $n/2$ elements. So, none of them is coverable.

The canonical RFSA $C(A_n)$ are exponentially larger than the initial NFA. □
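The counting argument can be checked experimentally for small $n$. The sketch below is illustrative and based on our reading of the definition of $A_n$ (letter $a$ shifts every state cyclically, letter $b$ fixes $q_0$ and shifts $q_1, \ldots, q_{n-1}$ cyclically, and the initial states are the $q_i$ with $i < n/2$); the function names and the integer encoding of states are assumptions.

```python
def step(delta, states, x):
    return frozenset(r for q in states for r in delta.get((q, x), ()))

def build_A(n):
    """Assumed encoding of A_n for n >= 2; state i stands for q_i."""
    delta = {}
    for i in range(n):
        delta[(i, "a")] = {(i + 1) % n}      # a: q_i -> q_{i+1}, q_{n-1} -> q_0
    delta[(0, "b")] = {0}                    # b fixes q_0
    delta[(1, "b")] = {n - 1}                # b: q_1 -> q_{n-1}
    for i in range(2, n):
        delta[(i, "b")] = {i - 1}            # b: q_i -> q_{i-1} for 1 < i < n
    init = frozenset(i for i in range(n) if i < n / 2)
    return ("ab", set(range(n)), init, frozenset({0}), delta)

def reachable_subsets(nfa):
    sigma, _states, init, _final, delta = nfa
    start = frozenset(init)
    seen, todo = {start}, [start]
    while todo:
        p = todo.pop()
        for x in sigma:
            r = step(delta, p, x)
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return seen

def noncoverable_count(nfa):
    """Number of states of C(A): reachable subsets that are not coverable."""
    reach = reachable_subsets(nfa)
    return sum(1 for p in reach
               if frozenset().union(*[r for r in reach if r < p]) != p)
```

For $n = 4$ the reachable subsets are the six two-element subsets of the four states, none of which is coverable, so `noncoverable_count(build_A(4))` evaluates to 6 while $A_4$ itself has only 4 states.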

Proposition 9. There exist languages for which the smallest characterizing word for 
some state has a length exponentially bigger than the number of states of the canonical 
RFSA. 

Sketch of proof: Let $P = \{p_1, \ldots, p_n\}$ be a set of $n$ distinct prime numbers. We define
the NFA $A_P = (\Sigma, Q, Q_0, F, \delta)$ by:

- $\Sigma = \{a\} \cup \{b_p \mid p \in P\}$

- $Q = \{q_p^i \mid p \in P, 0 \leq i < p\}$

- $Q_0 = \{q_p^0 \mid p \in P\}$
- F = Q^

- 5 is defined by: 

5{ql,a) = 

for 0 < i < p, p G P, 

for 0 < i < p — l,p,p' G P, 

b(Ci,v) = {<?n 

forp,y G P. 

The following results can be proved: 

- $A_P$ is a RFSA.

- The smallest characterizing word $u_q$ of a state $q \in Q$ is such that $|u_q| \geq \prod_i p_i$,
which is exponential with regard to the size of $A_P$ and therefore exponential with
regard to the size of the canonical RFSA.

□ 

Let $A = (\Sigma, Q, Q_0, F, \delta)$ be a RFSA and let $q \in Q$ be such that $L_q$ is prime. There
must exist a smallest word $u \in L_q$ such that $L_{q'} \subsetneq L_q \Rightarrow u \notin L_{q'}$. The next proposition
proves that this word can be very long.

Proposition 10. There exist languages for which the smallest word that proves that a 
state of the canonical RFSA is not composed has an exponential size with regard to the 
number of states of the minimal DFA. 

Proof: Let $p_1, \ldots, p_n$ be distinct prime numbers. For each $i$, $1 \leq i \leq n$, we note
$L_i = \{\varepsilon\} \cup \{a^k \mid p_i \text{ is not a divisor of } k\}$. Let $b_0, \ldots, b_n$ be distinct letters different from $a$.
We consider the language $L = b_0a^* \cup (\bigcup_{1 \leq i \leq n} b_iL_i)$.

We can easily build a minimal DFA for this language; it contains $\sum_i p_i + n + 2$ states.
The language $b_0^{-1}L = a^*$ is not a union of residuals $b_i^{-1}L$, $i \geq 1$. But the shortest word
that belongs to $b_0^{-1}L \setminus \bigcup_{i \geq 1} b_i^{-1}L$ is $a^{p_1 \cdots p_n}$, and its length is exponential with regard
to the size of the minimal DFA. □
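The gap between the size of the DFA and the length of the separating word can be made concrete with a few lines of arithmetic. The sketch below is illustrative: the function names are our own, and `dfa_size` simply evaluates the state count $\sum_i p_i + n + 2$ read from the proof rather than building the automaton.

```python
from math import prod

def in_Li(k, p):
    """a^k belongs to L_i = {eps} U {a^k | p does not divide k}."""
    return k == 0 or k % p != 0

def shortest_separating_length(primes):
    """Smallest k such that a^k lies in a* but in none of the L_i."""
    k = 0
    while any(in_Li(k, p) for p in primes):
        k += 1
    return k

def dfa_size(primes):
    """State count sum(p_i) + n + 2 taken from the proof above."""
    return sum(primes) + len(primes) + 2
```

With the primes 2, 3 and 5, the shortest separating word is $a^{30}$ while the DFA has 15 states; adding further primes makes the word length grow as the product of the primes and the DFA size only as their sum.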

7 Complexity Results about RFSA 

We have defined notions of RFSA, saturated automata, canonical RFSA ; in this section, 
we evaluate the complexity of our constructions and of decision problems linked to 
them: deciding if an automaton is saturated, building the canonical RFSA of a given 
language, and so on . . . 

Classical definitions about complexity can be found in [GJ79] and complexity
results about automata can be found in [HU79]. We present here simple complexity results
about RFSA, proofs of which can be found in [DLT00b].

The first notion that we defined is the notion of saturation. As one could guess, 
deciding if an automaton is saturated is easier for a DFA than for a NFA. 

Proposition 11. Deciding whether a DFA is saturated is a polynomial problem. On the
other hand, deciding whether a NFA is saturated is a PSPACE-complete problem.
Building the saturated of a NFA is also a PSPACE-complete problem.

The next proposition tells us that it is not practically possible, in the worst case, to
check whether a NFA is a RFSA.

Proposition 12. Deciding if a NFA is a RFSA is a PSPACE-complete problem.

Building the canonical RFSA equivalent to a given NFA is an exponential problem
in general, as proved by proposition 8. The next proposition tells us that, even if the
starting automaton is deterministic, this problem is PSPACE-complete. The problem
of deciding whether the saturated of a DFA is a canonical RFSA is also
PSPACE-complete.

Proposition 13. Deciding if the saturated of a DFA is a canonical RFSA is a
PSPACE-complete problem. Building the canonical RFSA equivalent to a DFA is
also a PSPACE-complete problem.



8 Comments and Conclusion 

Ideas developed in this paper come from work done in the domain of Grammatical
Inference. A main problem in this field is to infer efficiently (a representation of) a
regular language from a finite set of examples of this language. Some positive results can
be proved when regular languages are represented by Deterministic Finite Automata
(DFA). For example, it has been proved that Regular Languages represented by DFA
can be inferred from given data ([Gol78,Hig97]). In this framework, classical inference
algorithms such as RPNI ([OG92]) need a polynomial number of examples relative
to the size of the minimal DFA that recognizes the language to be inferred. So,
regular languages as simple as $\Sigma^*0\Sigma^n$ cannot be inferred efficiently using these algorithms
since their minimal DFA have an exponential number of states. Hence, it is a
natural idea to try to use other kinds of representations for regular languages, such as Non
deterministic Finite Automata (NFA). Unfortunately, it has been proved that Regular
Languages represented by NFA cannot be efficiently inferred from given data ([Hig97]).
We described in [DLT00a] an inference algorithm (DeLeTe) that computes the
canonical RFSA of a target regular language from given data. Using this algorithm, languages
such as $\Sigma^*0\Sigma^n$ become efficiently learnable. So, introducing the class of RFSA in
the field of grammatical inference seems to be a promising idea. However, we have to
deal with the fact that most decision and construction problems linked to the class of
RFSA are intractable in the worst case. What are the practical consequences of these
worst-case complexity results? Experiments we are currently conducting in the field of
grammatical inference suggest that they may not be too dramatic.

While carrying out this work, we have felt that RFSA was a class of automata worth
studying for itself, from a language theory point of view, and this is what we have
done in this paper. The class of RFSA has a very simple definition. It provides a
description level of regular languages which is intermediate between a representation by
deterministic automata and a representation that uses the whole class of non
deterministic automata. RFSA share two main properties with the class of DFA: the existence
of a canonical minimal form and the fact that states correspond to natural components of
the recognized language. Moreover, canonical RFSA can be exponentially smaller than
the equivalent minimal DFA. All these properties show that RFSA form an interesting
class whose study should be pursued.
Abstract. We consider the problem of distributed deterministic broadcasting
in radio networks. The network is synchronous. A node receives
a message in a given round if and only if exactly one of its neighbors 
transmits. The source message has to reach all nodes. We assume that 
nodes do not know network topology or even their immediate neighbor- 
hood. We are concerned with two efficiency measures of broadcasting 
algorithms: its execution time (number of rounds), and its cost (number 
of transmissions). We focus our study on execution time of algorithms 
which have cost close to minimum. 

We consider two scenarios depending on whether nodes know or do not 
know global parameters of the network: the number n of nodes and the 
eccentricity D of the source. Our main contributions are lower bounds
on the time of low-cost broadcasting, which show sharp differences between
these scenarios.



1 Introduction 

Radio networks have been extensively investigated by many researchers
[1,3,5,6,7,8,9,12,14,17,18,19]. A radio network is a collection of stations, called nodes,
which are equipped with capabilities of transmitting and receiving messages. 
Every node can reach a given subset of other nodes, depending on the power 
of its transmitter and on the topography of the region. Hence a radio network 
can be modeled by its reachability graph in which the existence of a directed 
edge (u, v) means that node v can be reached from u. In this case u is called a 
neighbor of v. 

Nodes send messages in synchronous rounds measured by a global clock. In 
every round every node either transmits (to all nodes within its reach) or is silent. 
A node which is silent in a given round gets a message if and only if exactly one 
of its neighbors transmits in this round. If at least two neighbors of u transmit 
simultaneously in a given round, none of the messages is received by u in this 
round. In this case we say that a collision occurred at u. 
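The reception rule of this model can be sketched as a single-round simulation. The code is only an illustration of the rule stated above: the function name `receivers`, the encoding of the network as a set of directed edges `(u, v)` (meaning that `v` can be reached from `u`), and the example graph are our own assumptions.

```python
def receivers(edges, transmitters):
    """Return {node: sender} for the nodes that get a message in this round.

    A silent node receives iff exactly one of its neighbors transmits;
    two or more transmitting neighbors cause a collision and nothing is heard.
    """
    heard = {}
    for u, v in set(edges):
        if u in transmitters:
            heard.setdefault(v, []).append(u)
    return {v: senders[0] for v, senders in heard.items()
            if len(senders) == 1 and v not in transmitters}

# Hypothetical network: node 0 reaches 1 and 2, which both reach 3.
EDGES = {(0, 1), (0, 2), (1, 3), (2, 3)}
```

In this example, if only node 0 transmits then nodes 1 and 2 both receive; if nodes 1 and 2 transmit in the same round, a collision occurs at node 3 and it receives nothing.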
** Andrzej Pelc was supported in part by NSERC grant OGP 0008136.



A. Ferreira and H. Reichel (Eds.): STACS 2001, LNCS 2010, pp. 158-169, 2001.
© Springer-Verlag Berlin Heidelberg 2001




Deterministic Radio Broadcasting at Low Cost 






There are two models studied in the literature which differ by specifying what 
exactly happens during a collision. The model with collision detection assumes 
that in this case the node at which collision occurred gets a signal different from 
the messages transmitted but also different from the background noise, and 
thus the node can deduce that more than one of its neighbors transmitted. An 
alternative model assumes no collision detection, i.e., supposes that the signal 
obtained as a result of collision is not different from the background noise, and 
thus nodes cannot distinguish multiple transmissions from no transmission. A 
comparative discussion justifying both models can be found in [3,15]. In this 
paper we use the model assuming no collision detection, as e.g., in [3, 7,8, 9]. 

Broadcasting is one of the basic tasks in network communication (cf. surveys 
[13,16]). One node of the network, called the source, has to transmit a message 
to all other nodes. Remote nodes are informed via intermediate nodes, along 
directed paths in the network. We assume that there exists a directed path from 
the source to any other node, and we restrict attention to such graphs only. 

One of the basic performance measures of a broadcasting scheme is the total 
time, i.e., the number of rounds it uses to inform all the nodes of the network. 
There is, however, another natural measure of efficiency of a broadcasting algo- 
rithm, and this is the total number of transmissions it uses. We call this num- 
ber the cost of the broadcasting scheme. Algorithms using few transmissions to 
broadcast in a radio network are less expensive to run. Apart from that they 
may permit portions of the network which are remote from currently transmit- 
ting nodes to carry out simultaneously non broadcast related transmissions. 

The aim of this paper is to study broadcasting algorithms working in radio 
networks of unknown topology, as in [3, 7, 8, 9]. As opposed to [3, 7, 8, 9], we consider 
not only the time but also the cost of broadcasting. The main subject of our 
study is execution time of low-cost algorithms, i.e., those whose cost is close to 
minimum. 

We assume that nodes do not have any knowledge of network topology, and
that local knowledge of every node is limited to its own label. For n-node
networks we assume that labels are distinct integers from the set $\{0, \ldots, n-1\}$ but all
our results remain valid if labels are from the set $\{0, \ldots, N\}$, where $N \in O(n)$. (It
is well known that radio broadcasting cannot be carried out in the anonymous
model, even in the 4-ring.) We consider two scenarios depending on whether
nodes know or do not know global parameters of the network: the number n 
of nodes and the eccentricity D of the source (i.e., the maximum length of all 
shortest paths from the source to all other nodes). Since these parameters may 
be unknown to nodes, the broadcasting process may be finished but nodes may 
be unaware of this. In fact it was proved in [7] that broadcasting with acknowl- 
edgement is impossible in unknown networks without collision detection. Conse- 
quently, we define time of broadcasting without requiring that nodes know that 
the process is terminated, similarly as in [7]. A broadcasting algorithm works in 
t rounds on a network G, if t is the minimum integer such that after round t 
all nodes of G know the source message, and no messages are transmitted after 
round t. Likewise, an algorithm has cost c for a network G, if c is the minimum
integer such that all nodes get the source message after c transmissions and no
more transmissions are executed when the algorithm is run on G. 

Since our algorithms run on arbitrary unknown networks with n nodes and 
eccentricity D of the source, we are interested in their worst-case performance 
on the class of all networks with these parameters. Consequently, we define the 
time (resp. cost) of an algorithm for networks with parameters n and D as the 
maximum time (resp. cost) of this algorithm over all networks in this class. 

1.1 Related Work 

In much of the research on broadcasting in radio networks [1,3,5,6,14,18] the 
network is modeled as an undirected graph, which is equivalent to the assumption 
that the reachability graph is symmetric. The focus of research in these papers 
is broadcasting time, and more precisely, finding upper and lower bounds on 
it, under the assumption that nodes have full knowledge of the network. In [1] 
the authors proved the existence of a family of n-node networks of radius 2, for 
which any broadcast requires time $\Omega(\log^2 n)$, while in [14] it was proved that
broadcasting can be done in time $O(D + \log^5 n)$, for any n-node network of
diameter D. In [17] the authors discussed broadcasting time in radio networks 
arising from geometric locations of nodes on the line and in the plane, under the 
assumption that some of the nodes may be faulty. 

In the above papers, the topology of the radio network was known in advance, 
and broadcasting algorithms were deterministic and centralized. On the other 
hand, in [3] a randomized protocol was given for arbitrary radio networks of 
unknown topology. This randomized protocol runs in expected time
$O(D \log n + \log^2 n)$. In [18] it was shown that for any randomized broadcast
protocol and parameters D and n, there exists an n-node network of diameter D
requiring expected time $\Omega(D \log(n/D))$ to execute this protocol.

These results suggest an interesting question concerning the efficiency of de- 
terministic broadcasting algorithms working in networks of unknown topology. 
The first paper to deal with this scenario was [3]. The authors showed that 
any such algorithm requires time $\Omega(n)$ for some symmetric network of constant
diameter. In [12] fast deterministic broadcasting algorithms were given for net- 
works of unknown topology but of a very restricted class: the authors assumed 
that nodes are located in unknown points of the line, and every node can reach 
all nodes within a given radius from it. 

Deterministic broadcasting in arbitrary radio networks of unknown topology 
was first investigated in [7]. The authors showed an algorithm working in time
$O(n^{11/6})$ and established a lower bound $\Omega(n \log n)$ on broadcasting time (cf. [4]
where this lower bound was earlier proved in a slightly different setting). Then
a series of faster broadcasting algorithms were proposed, with execution
times $O(n^{5/3}(\log n)^{1/3})$ [11], $O(n^{3/2}\sqrt{\log n})$ [20], $O(n^{3/2})$ [8], and
$O(n \log^2 n)$ [9]. In all these papers time was the only considered measure of
efficiency of broadcasting. However, in [7] a very simple and slower broadcasting
algorithm was also proposed. Its execution time is $O(n^2)$ but from our point of
view it has an additional advantage: its cost is minimum, i.e., n.




To the best of our knowledge, relations between time and cost of broad- 
casting in radio networks have never been studied previously. However, cost 
of communication measured in terms of the number of transmissions has been 
widely studied for point-to-point networks, mostly in the context of gossiping, 
i.e., of all-to-all communication (cf. [16] and the literature therein). It should 
be stressed that, unlike in radio networks, where a node transmits to all nodes 
within its reach, transmissions in point-to-point networks occur between spe- 
cific pairs of nodes. Tradeoffs between time and cost of communication in such 
networks were studied, e.g., in [10]. On the other hand, cost of broadcasting in 
point-to-point networks in which nodes know only their neighborhood, was the 
subject of [2]. 

1.2 Our Results 

While we assume that nodes do not have any knowledge of network topology, 
we consider two scenarios depending on whether they know or do not know 
global parameters of the network: the number n of nodes and the eccentricity 
D of the source. We show that the minimum cost of broadcasting in an n-node 
network of unknown topology is n, if at least one of the above parameters is 
unknown, and it is $n - 1$, if both of them are known. Our main contributions
are lower bounds on time of low-cost broadcasting which show sharp differences
between these scenarios. We show that if nodes know neither n nor D then
any broadcasting algorithm whose cost exceeds the minimum by $O(n^\beta)$, for any
constant $\beta < 1$, must have execution time $\Omega(Dn \log n)$ for some network. We
also show a minimum-cost algorithm that does not assume knowledge of these
parameters, and always works in time $O(Dn \log n)$. On the other hand, assuming
that nodes know either n or D, we show how to broadcast in time $O(Dn)$. This
time cannot be improved by any low-cost algorithm even knowing both n and
D. Indeed, we show that any algorithm whose cost exceeds the minimum by
at most $\alpha n$, for any constant $\alpha < 1$, requires time $\Omega(Dn)$. Hence we obtain
asymptotically tight bounds on time of low-cost broadcasting under these two 
scenarios, and we show that knowing at least one of the global parameters n or 
D results in faster low-cost broadcasting than when none of them is known. 

In addition, we show that very fast broadcasting algorithms must have high 
cost. We prove that every broadcasting algorithm that works in time $O(n\,t(n))$,
where $t(n)$ is polylogarithmic in n, requires cost $\Omega(n \log n / \log\log n)$. Since the
fastest known algorithm works in time $O(n \log^2 n)$ [9], its cost (as well as the
cost of any faster broadcasting algorithm, if it exists) must be higher than linear. 

2 Minimum-Cost Broadcasting 

In this section we establish asymptotically tight upper and lower bounds on 
execution time of minimum-cost broadcasting algorithms in two situations: (1) 
when nodes know either the number n of nodes or the eccentricity D of the 
source, and (2) when nodes do not know any of these parameters. It turns out 
that optimal broadcasting time is different in each of those cases.

2.1 Lower Bounds on Cost

We first establish lower bounds on the cost of any broadcasting algorithm work- 
ing without knowledge of topology. The cost of all algorithms presented in this 
section will match these lower bounds and thus these are minimum-cost algorithms.
The minimum cost turns out to be n if at least one of the parameters
n or D is unknown to nodes, and it is $n - 1$, if both of them are known. The
proofs of these lower bounds are omitted due to lack of space, and will appear 
in the full version of the paper. 

Theorem 1. If the eccentricity D of the source is unknown to nodes then every 
broadcasting algorithm requires cost at least n for some n-node networks, for any 
n > 2. 



Theorem 2. If the size n of the network is unknown to nodes then, for suffi- 
ciently large integers n, every broadcasting algorithm requires cost at least n for 
some n-node networks. 

If nodes know both the size n of the network and the eccentricity D of the 
source, we will see that cost can be lower but the following result shows that the 
gain can only be 1. 

Theorem 3. Every broadcasting algorithm requires cost at least $n - 1$ for some
n-node networks, for any n > 2. 

2.2 Broadcasting Time with Known n or D 

We first consider the case when nodes of the network know at least one of the 
global parameters: the number n of nodes, or the eccentricity D of the source. 
We begin by presenting minimum-cost broadcasting algorithms working under 
these assumptions. We present three different algorithms, depending on whether 
nodes know only n, only D, or both these parameters. 

The simplest case occurs when nodes know the size n of the network but do 
not know the eccentricity D of the source. By Theorem 1 the lower bound on 
cost is n in this case, and broadcasting can be performed in time $O(Dn)$ using
the following simple algorithm in which every node transmits exactly once, i.e., 
at cost n. Therefore this is a minimum-cost algorithm. 

Algorithm Only-Size-Known

The source transmits in the first round. A node with label $l$ transmits the
source message in the first round $r$ after it has received the message and for
which $l = r \pmod{n}$, and then stops. □
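
The behavior of Algorithm Only-Size-Known can be illustrated by a small simulation. The following Python sketch is not part of the original text: the function name `only_size_known`, the adjacency-dictionary encoding of the directed reachability graph, and the default source label 0 are assumptions made for illustration. It also assumes the usual radio model, in which a node receives the message in a round exactly when one of its in-neighbors transmits.

```python
def only_size_known(n, out_edges, source=0):
    """Simulate Algorithm Only-Size-Known on a directed graph with labels 0..n-1.

    out_edges[u] lists the nodes that can hear u (hypothetical encoding).
    Returns (round of the last transmission, number of transmissions,
             whether every node received the source message).
    """
    in_nbrs = {v: set() for v in range(n)}
    for u, targets in out_edges.items():
        for v in targets:
            in_nbrs[v].add(u)

    informed_at = {source: 0}  # round in which each node first got the message
    transmitted = set()        # nodes that have used their single transmission
    last_round = 0
    r = 0
    # Continue while some informed node has not transmitted yet.
    while len(transmitted) < len(informed_at):
        r += 1
        if r == 1:
            senders = {source}
        else:
            # Label l transmits in the first round r after reception with l = r (mod n).
            senders = {v for v in informed_at
                       if v not in transmitted and v % n == r % n}
        for v in range(n):
            # Radio model: reception only if exactly one in-neighbor transmits.
            if v not in informed_at and len(in_nbrs[v] & senders) == 1:
                informed_at[v] = r
        if senders:
            transmitted |= senders
            last_round = r
    return last_round, len(transmitted), len(informed_at) == n


print(only_size_known(4, {0: [1], 1: [2], 2: [3]}))  # path 0 -> 1 -> 2 -> 3
print(only_size_known(4, {0: [3], 3: [2], 2: [1]}))  # path 0 -> 3 -> 2 -> 1
```

On the directed path 0 → 1 → 2 → 3 the sketch reports 7 rounds, and 9 rounds when the labels decrease along the path (0 → 3 → 2 → 1). In both cases every node transmits exactly once, so the cost is n = 4.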

When only the eccentricity of the source is known, the situation is slightly 
more complicated. In this case the lower bound on cost is also n, by Theorem 2. 
The following algorithm performs broadcasting with the same efficiency as above 
(time $O(Dn)$ and cost n). The algorithm probes for n by repeatedly doubling
the range of labels of nodes allowed to transmit. Again every node transmits
exactly once. (Compare [7] where this technique was used with both n and D
unknown, resulting in time $O(n^2)$.)

Algorithm Only-Eccentricity-Known 

The source transmits in the first round. The algorithm works in stages. Stage
$i$ consists of $D2^i$ rounds. In every stage enumerate rounds from 0 to $D2^i - 1$. A
node with label $l$ transmits the source message in the first round $r$, after it has
received the message and for which $l = r \pmod{2^i}$, and then stops. □
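
The doubling schedule of Algorithm Only-Eccentricity-Known can be sketched in the same way. The code below is illustrative only and rests on two assumptions that the text does not spell out: stages are numbered from 0, and the condition on labels is read as "the label equals the stage-round number reduced modulo 2^i", so that only labels below 2^i may transmit in stage i. Under this reading at most one label is scheduled per round, hence no collisions arise and the sender's out-neighbors always receive the message. The names `slots` and `only_eccentricity_known` are hypothetical.

```python
from itertools import count


def slots(D):
    """Yield the label allowed to transmit in each round after the source's round."""
    for i in count(0):               # stage i (starting at 0 is an assumption)
        for r in range(D * 2 ** i):  # rounds 0 .. D*2^i - 1 of stage i
            yield r % 2 ** i         # only labels below 2^i get a slot


def only_eccentricity_known(D, out_edges, source=0):
    """Return (round of the last transmission, number of transmissions)."""
    # Round 1: the source transmits and all its out-neighbors are informed.
    informed = {source} | set(out_edges.get(source, ()))
    transmitted = {source}
    rounds = last_tx = 1
    for label in slots(D):
        if transmitted == informed:  # every informed node has already transmitted
            break
        rounds += 1
        if label in informed and label not in transmitted:
            transmitted.add(label)
            informed |= set(out_edges.get(label, ()))
            last_tx = rounds
    return last_tx, len(transmitted)


# Directed path 0 -> 1 -> 2 -> 3, whose source has eccentricity D = 3.
print(only_eccentricity_known(3, {0: [1], 1: [2], 2: [3]}))
```

For this path the sketch reports the last transmission in round 14 at cost 4: node 1 has to wait for stage 1, and nodes 2 and 3 for stage 2, before their labels fall into the allowed range.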

When both parameters D and n are known, the lower bound on cost is 
$n - 1$ (cf. Theorem 3), and the above algorithms which work at cost n are not
minimum-cost. However Algorithm Only-Size-Known can be modified, so as to 
reduce cost by 1 on any n-node network with source eccentricity D. 

Algorithm Both-Known 

The source transmits in the first round. All other rounds are partitioned into 
consecutive disjoint segments of length n. Rounds in each segment are numbered 
$0, \dots, n-1$. A node with label $l$ that gets the source message for the first time in
some round of segment $i < D - 1$, transmits it in round $l$ of segment $i + 1$, and
then stops. □ 

Theorem 4. Algorithms Only-Size-Known, Only-Eccentricity-Known, and 
Both-Known complete broadcasting at minimum cost, corresponding to their respective
assumptions. They all work in time $O(Dn)$, in any n-node network with
eccentricity D of the source. 

We now show that broadcasting at minimum cost cannot be performed 
asymptotically faster than by the above algorithms, if either n or D is
known to nodes. The result holds also when both parameters are known. 

Theorem 5. Suppose that nodes know either the number n of nodes or the 
eccentricity D of the source. Every minimum-cost broadcasting algorithm requires
time $\Omega(Dn)$ for some n-node networks with eccentricity D of the source,
if $1 < D < n - 2$.

Proof. Fix a minimum-cost broadcasting algorithm A working under the as- 
sumption of the theorem. (Recall that the minimum cost is either n or n — 1, 
depending on the knowledge of nodes.) In the rest of the argument we consider 
only n-node networks with eccentricity D of the source in which exactly one 
node is at distance D from the source. We will construct such a network G for 
which algorithm A requires time $\Omega(Dn)$. Fix a source $v_1$. It will have indegree
0. Assume without loss of generality that $v_1$ transmits in the first round.

Consider any node $u \neq v_1$ and any network with source $v_1$ for which $v_1$ is
the unique neighbor of u. The behavior of u is the same for all these networks.
If some such node u never transmits, the algorithm is incorrect. Indeed, there
exists a network in the considered class, in which some node w has u as its unique
neighbor. This node w could not get the source message. On the other hand, if 
some such node u transmits more than once, we get a contradiction with the 
fact that A is a minimum-cost algorithm. 

Hence, in any network with source $v_1$ for which $v_1$ is the unique neighbor
of u, node u transmits exactly once. Consider two such nodes $u_1$ and $u_2$. Their
unique transmissions must occur in two distinct rounds. The argument is the
same as in the proof of Theorem 2.

Let $v_2$ be the node such that the round t in which $v_2$ transmits if its unique
neighbor is the source, is the latest among all remaining nodes. Add this node
$v_2$ and add the edge $(v_1, v_2)$. Node $v_2$ will not have any other neighbors. By
definition, node $v_2$ transmits no sooner than in round $1 + (n-1)$. Next we pick
node $v_3$ among all remaining nodes in the same way (replacing $v_1$ by $v_2$ in the
construction), and we add edge $(v_2, v_3)$. Node $v_3$ transmits no sooner than in
round $1 + (n-1) + (n-2)$. We continue with nodes $v_4, \dots, v_{D+1}$ in a similar
manner. Finally, we attach all remaining nodes directly to the source (i.e., we
add edges $(v_1, v_j)$, for $j = D+2, \dots, n$), thus creating an n-node network with
eccentricity D. The time required by algorithm A on this network is at least
$1 + (n-1) + (n-2) + \dots + (n-D) \in \Omega(Dn)$. □
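
The final estimate in the proof can be made explicit. The following short derivation is added for illustration; it evaluates the sum of delays and uses only the inequality $D \le n - 2$, which follows from the assumption of Theorem 5.

```latex
\begin{align*}
1 + \sum_{k=1}^{D} (n-k)
  &= 1 + Dn - \frac{D(D+1)}{2} \\
  &\ge 1 + Dn - \frac{D(n-1)}{2}
     && \text{(since } D \le n-2 \text{ gives } D+1 \le n-1\text{)} \\
  &= 1 + \frac{Dn}{2} + \frac{D}{2} \;>\; \frac{Dn}{2}.
\end{align*}
```

Hence the constructed network forces at least $Dn/2$ rounds, which is the claimed $\Omega(Dn)$ bound.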

2.3 Broadcasting Time with Unknown n and D 

The following algorithm performs broadcasting when n and D are both unknown. 
This is done by repeatedly increasing Dn by a factor of 4, and for each value 
of Dn attempting to broadcast for different ratios of n/D. (A similar algorithm 
has been independently proposed in [20].) 

Algorithm Unknown 

The algorithm is divided into stages. Stage $i$ consists of $i+1$ phases numbered
0 to $i$. Phase $j$ of stage $i$ has $2^{2i-j} \times 2^j = 2^{2i}$ rounds. The rounds of phase $j$ of
stage $i$ are each assigned to one label. The $k$th round of the phase is assigned
to label $k \bmod 2^{2i-j}$. A node $v$ transmits the source message once, in the first
round assigned to the label of $v$ after it receives the source message. □
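
A possible reading of the schedule of Algorithm Unknown is sketched below in Python. The exponents in the description of the algorithm are hard to read in the source, so the sketch assumes that phase j of stage i consists of 4^i rounds whose k-th round (counting from 0) is assigned to label k mod 2^(2i-j); this matches the remark that Dn grows by a factor of 4 per stage, but it remains an assumption. Stages are assumed to start at 0, and the names `unknown_slots` and `algorithm_unknown` are hypothetical. Since one label is scheduled per round, no collisions occur.

```python
from itertools import count


def unknown_slots():
    """Yield the label assigned to each successive round (assumed schedule)."""
    for i in count(0):                    # stage i
        for j in range(i + 1):            # phases 0 .. i
            modulus = 2 ** (2 * i - j)    # assumed guess for the range of labels
            for k in range(4 ** i):       # 2^(2i-j) * 2^j = 4^i rounds per phase
                yield k % modulus


def algorithm_unknown(out_edges, source=0):
    """Return (round of the last transmission, number of transmissions)."""
    informed = {source}
    transmitted = set()
    rounds = last_tx = 0
    for label in unknown_slots():
        if transmitted == informed:       # nobody is left waiting for a slot
            break
        rounds += 1
        if label in informed and label not in transmitted:
            transmitted.add(label)        # each node transmits exactly once
            informed |= set(out_edges.get(label, ()))
            last_tx = rounds
    return last_tx, len(transmitted)


print(algorithm_unknown({0: [1], 1: [2], 2: [3]}))  # path 0 -> 1 -> 2 -> 3
print(algorithm_unknown({0: [3], 3: [2], 2: [1]}))  # path 0 -> 3 -> 2 -> 1
```

Under these assumptions the first path is finished in round 5 and the second in round 27, both at cost 4. The difference shows how strongly the running time depends on the order of labels along the path, while the cost stays minimal.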



Theorem 6. Algorithm Unknown performs broadcasting in $O(Dn \log n)$ rounds,
at (minimum) cost n. 

The following lower bound on time shows that Algorithm Unknown is asymp- 
totically optimal among minimum-cost broadcasting algorithms, when parame- 
ters n and D are unknown to nodes. 

Theorem 7. Every minimum-cost broadcasting algorithm requires $\Omega(Dn \log n)$
rounds for some n-node networks with eccentricity D of the source, when both n 
and D are unknown to nodes.

Proof. Fix a minimum-cost algorithm A and an even positive integer $i$. We
will construct an n-node network G with eccentricity D of the source and with
$Dn \in \Theta(2^i)$, on which the algorithm performs broadcasting in time $\Omega(Dn \log n)$.
Since D is unknown, every node must transmit at least once, as shown in the 
proof of Theorem 1. In fact, every node must transmit exactly once, for otherwise 
cost n would be exceeded. 

If for a network H the algorithm schedules two distinct nodes u and v to 
transmit in the same round, construct a network H' by adding to H a new node
w and edges (u,w) and (v,w). Nodes u and v behave identically when A is run on H
and on H'. In the unique round in which u and v transmit, a collision occurs at
w, and hence w cannot get any message, as u and v are its unique neighbors. 
Hence all nodes must transmit in separate rounds. 
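
The collision argument relies only on the reception rule of the radio model. A minimal Python sketch of that rule follows; the function name `hears` and the dictionary of in-neighbors are illustrative assumptions, not notation from the paper.

```python
def hears(v, senders, in_neighbors):
    """Return the node whose message v receives in this round, or None.

    v receives a message only if exactly one of its in-neighbors transmits;
    silence and collisions are indistinguishable (no collision detection).
    """
    active = [u for u in in_neighbors[v] if u in senders]
    return active[0] if len(active) == 1 else None


# Node w of H' has u and v as its only in-neighbors.
in_neighbors = {"w": ["u", "v"]}
print(hears("w", {"u", "v"}, in_neighbors))  # both transmit: collision, prints None
print(hears("w", {"u"}, in_neighbors))       # a single transmitter: prints u
```

If u and v each transmit only once and in the same round, the first call describes the only round in which w could have been informed, so w never gets the source message.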

We now proceed with the construction of network G. The node with label 0 
will be the source with indegree 0. Partition all integers between 1 and $2^i - 1$
into $i/2$ disjoint consecutive segments $S_{i/2+1}, \dots, S_i$, such that
$2^{j-1} < |S_j| < 2^j$. For each set $S_j$ consider all networks whose set of labels is
$S_j \cup \{0\}$. For all of these networks compute the set of rounds in which some
node of the network apart from the source transmits. Let $R_j$ be the union of the
sets of rounds corresponding to all networks with set of nodes $S_j \cup \{0\}$. The sets
$R_{i/2+1}, \dots, R_i$ must be pairwise disjoint because two nodes cannot transmit
simultaneously. Theorem 5 (with suitable relabeling of nodes and renumbering of
rounds) shows that, for any $j = i/2+1, \dots, i$, there exists a network $G_j$ with set
of nodes $S_j \cup \{0\}$, and eccentricity of the source $D_j \in \Theta(2^{i-j})$, on which
algorithm A works in $\Omega(2^i)$ rounds from $R_j$. Let $X_j$ denote the set of all rounds
from $R_j$ which are less than or equal to the last round in which A schedules a
transmission when run on network $G_j$. Hence the sizes of all sets $X_j$ are
$\Omega(2^i)$, and all these sets are pairwise disjoint, as subsets of sets $R_j$.
Consequently, some $X_j$ must contain a round $\Omega(i2^i)$. This means that algorithm
A requires time $\Omega(i2^i)$ on the respective network $G_j$.

Now we construct G by augmenting $G_j$ as follows. Attach all nodes with
labels from all sets $S_{j'}$, $j' < j$, directly to the source of $G_j$. The resulting
network G has size $n < 2^{j+1}$ and nodes are numbered by consecutive integers.
The eccentricity of the source remains unchanged: $D = D_j \in \Theta(2^{i-j})$. Hence
$2^i \in \Omega(Dn)$ and $i \in \Omega(\log n)$, and consequently A requires time
$\Omega(Dn \log n)$ on G. □



3 Broadcasting at Low Cost 



In this section we generalize Theorems 5 and 7 by proving the respective lower 
bounds on time for a larger class of algorithms: not only those with minimum 
cost but for all algorithms whose cost is close to minimum. This shows that 
in order to decrease broadcasting time, cost of broadcasting must be increased 
significantly. In the proofs we indicate how our previous arguments should be 
extended in this more general case.

Theorem 8. Every broadcasting algorithm with cost at most $\alpha n$, for a constant
$\alpha < 2$, works in time $\Omega(nD)$ for some n-node networks with eccentricity D of
the source, whenever $D < cn$, for some constant $c < 4 - 2\alpha$.

Proof. Fix a broadcasting algorithm A and a constant $\alpha < 2$. We will construct
an n-node network G with eccentricity D of the source, for which the algorithm
either works in time $\Omega(nD)$ or works at cost larger than $\alpha n$. Fix a source $v_1$. It
will have indegree 0. Assume without loss of generality that $v_1$ transmits in the
first round. Let $v_2$ be the node such that the round $t_1$ in which $v_2$ first transmits
the source message if its unique neighbor is $v_1$, is the latest among all nodes
different from $v_1$. Add the node $v_2$, and the edge $(v_1, v_2)$.

If $t_1 < 2n - \alpha n$, add the edge $(v_1, u)$ for every other node u. We will show
that for the resulting network G the algorithm has cost larger than $\alpha n$. Indeed,
every node except the source transmits during rounds number 2 to $2n - \alpha n - 1$.
For every round r in this interval, at most one node transmits in round r and in
no other round. Otherwise two nodes transmit once and during the same round
which is impossible (cf. the proof of Theorem 7). Hence, at most $2n - \alpha n - 1$
nodes can transmit once and the remaining $\alpha n - n + 1$ nodes must transmit at
least twice. This results in cost larger than $\alpha n$.

If $t_1 \ge 2n - \alpha n$, i.e., $v_2$ transmits no sooner than in round $2n - \alpha n$, let $v_3$ be
the node such that the round $t_2$ in which $v_3$ first transmits the source message
if its unique neighbor is $v_2$, is the latest among all nodes different from $v_1$ and
$v_2$. Continue the construction of G by adding node $v_3$, and the edge $(v_2, v_3)$.

Similarly as above, it is either possible to construct a network yielding cost
larger than $\alpha n$ by adding the edge $(v_2, u)$ for every remaining node u, or $v_3$
transmits the source message no sooner than in round $4n - 2\alpha n - 2$. In this way,
if cost does not exceed $\alpha n$, we can construct a directed path of length D, for
which the algorithm requires more than $(D-1)(2n - \alpha n) - D(D-1)/2 \in \Omega(nD)$
rounds. (Here we use the assumption that $D < cn$, for some $c < 4 - 2\alpha$.) The
remaining $n - D - 1$ nodes should be attached directly to the source to produce
an n-node network with eccentricity D of the source, on which the algorithm 
requires the above number of rounds. □ 
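
The role of the assumption $D < cn$ with $c < 4 - 2\alpha$ in the last step of the proof can be seen from the following short calculation, added here for illustration.

```latex
\begin{align*}
(D-1)(2n-\alpha n) - \frac{D(D-1)}{2}
  &= (D-1)\Bigl((2-\alpha)\,n - \frac{D}{2}\Bigr) \\
  &\ge (D-1)\Bigl(2-\alpha-\frac{c}{2}\Bigr)\,n
     && \text{(using } D < cn\text{)}.
\end{align*}
```

Since $c < 4 - 2\alpha$, the factor $2 - \alpha - c/2$ is a positive constant, so for $D \ge 2$ the right-hand side is in $\Omega(nD)$.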

Theorem 9. Every broadcasting algorithm with cost less than $n + n^\beta$, for a
positive constant $\beta < 1$, works in time $\Omega(Dn \log n)$ for some n-node networks
with eccentricity D of the source, when n and D are unknown to nodes. 

Proof. Fix a broadcasting algorithm A and an even positive integer $i$. We
will construct an n-node network G with eccentricity D of the source and with
$Dn \in \Theta(2^i)$, on which the algorithm performs broadcasting in time $\Omega(Dn \log n)$,
or a network $H$ with $n' = 2^{i(1/2+\epsilon)}$ nodes ($\epsilon$ is a positive constant $< 1/2$ to be
determined later), on which the algorithm requires cost $n' + n'^\beta$. Since D is
unknown, every node must transmit at least once, as shown in the proof of 
Theorem 1. 

The node with label 0 will be the source with indegree 0. Similarly as in 
the proof of Theorem 7, partition all integers between 1 and $2^i - 1$ into $\epsilon i$
disjoint consecutive segments $S_{i/2+1}, \dots, S_{i(1/2+\epsilon)}$, such that
$2^{j-1} < |S_j| < 2^j$. For each set $S_j$ consider all networks whose set of labels is
$S_j \cup \{0\}$. For all of these networks compute the set of rounds for which there
exists a node v different from the source such that this round is the only one
when v transmits. (Notice the difference from the construction in the proof of
Theorem 7.) Let $R_j$ be the union of the sets of rounds corresponding to networks
with the set of nodes $S_j \cup \{0\}$. The sets $R_{i/2+1}, \dots, R_{i(1/2+\epsilon)}$ must be
pairwise disjoint because two nodes that
transmit exactly once cannot transmit simultaneously, by the same argument as 
in the proof of Theorem 7. 

For every $i/2 < j \le i(1/2+\epsilon)$ we construct a network $G_j$ with the set of nodes
$S_j \cup \{0\}$. The network $G_j$ consists of a directed path P of length $D_j \in \Theta(2^{i-j})$,
in which the source is the first node, and of nodes outside of P having the
source as their only neighbor and with outdegree 0. Pick the nodes $0, v_1, \dots, v_{D_j}$ in
path P in order of increasing indices, in the following way. At any stage of the
construction, if there is a choice of $v_k$ such that it transmits more than once, if
its unique neighbor is $v_{k-1}$, pick any such node. Otherwise, pick the node that
transmits at the latest, if its unique neighbor is $v_{k-1}$, among all still available
nodes. 

To find a lower bound on the time of algorithm A when run on $G_j$, we
calculate the delay induced by each node of the path. We do not count delays
caused by nodes picked for the first reason (transmitting more than once). A
node picked for the second reason delays broadcasting by at least $|S_j| - k + 1$
rounds from the set $R_j$. Otherwise we would have a choice of a node transmitting
more than once. Let X be the total sum over all networks $G_{i/2+1}, \dots, G_{i(1/2+\epsilon)}$
of numbers of nodes transmitting more than once.

If $X \ge n'^\beta$ then the cost of the algorithm on network $H$, which is the union
of $G_{i/2+1}, \dots, G_{i(1/2+\epsilon)}$, is at least $n' + n'^\beta$. This network has
$n' = 2^{i(1/2+\epsilon)}$ nodes.

If, on the other hand, $X < n'^\beta$, then every network $G_j$ has at least $D_j - X$
nodes contained in the directed path, that transmit exactly once. The total delay
of the algorithm on network $G_j$ is thus at least



$$(|S_j| - X) + (|S_j| - X - 1) + \dots + (|S_j| - D_j + 1)$$

$$= (D_j - X)(|S_j| - X) - (D_j - X)(D_j - X - 1)/2 \in \Omega(2^i),$$

for $\epsilon < \frac{1-\beta}{2(1+\beta)}$, since $|S_j| \in \Theta(2^j)$ and $D_j \in \Theta(2^{i-j})$. Let $X_j$
denote the set of all rounds from $R_j$ which are less than or equal to the last
round in which A schedules a transmission when run on network $G_j$. Hence the
sizes of all sets $X_j$ are $\Omega(2^i)$, and all these sets are pairwise disjoint, as
subsets of sets $R_j$. Consequently, some $X_j$ must contain a round $\Omega(i2^i)$. This
means that algorithm A requires time $\Omega(i2^i)$ on the respective network $G_j$.

Now we construct the network G by augmenting $G_j$, as in the proof of Theorem 7:
attach all nodes with labels from all sets $S_{j'}$, $j' < j$, directly to the source
of $G_j$. G has size $n < 2^{j+1}$, and eccentricity $D = D_j \in \Theta(2^{i-j})$ of the source.
Since $2^i \in \Omega(Dn)$ and $i \in \Omega(\log n)$, A requires time $\Omega(Dn \log n)$ on G. □

4 Cost of Very Fast Broadcasting

We finally show that very fast broadcasting algorithms (in particular the fastest 
known, working in time $O(n \log^2 n)$) must have cost higher than linear.

Theorem 10. Let $t(n)$ be any function polylogarithmic in n. Every broadcasting
algorithm that works in time $O(n\,t(n))$ requires cost $\Omega(n \log n / \log\log n)$ on some
n-node networks.

Proof. For simplicity assume that 8 divides n. Define the following n-node 
network G. The node with label 0 is the source of G. The remaining nodes will 
be divided into $n/2 - 1$ pairs. Enumerate the pairs $1, \dots, n/2 - 1$. The source is the
only neighbor of nodes of the first pair, and the only neighbors of nodes of pair
i + 1 are both nodes of pair i. The last node has the source as its only neighbor. 
No other connections exist in G. This completes the description of the topology 
of G. We will later assign labels to nodes, depending on the algorithm. 

Fix an algorithm A that performs broadcasting in time less than $c\,n\,t(n)$, for
a fixed constant c. Labels will be chosen for nodes in the order of increasing pair
numbers. Fix a pair $p < n/4$. Both nodes of pair p receive the source message in
the same round r. At this point in time all lower numbered pairs and no higher
numbered pairs of nodes have received the source message. For every possible
label of a node in pair p and for every round $> r$, algorithm A decides whether
the node transmits or not. There are two possible cases. If there exists a pair of
available labels (not used for lower numbered pairs) such that the behavior of
nodes in pair p with these labels is identical up to round $r + 8ct(n)$ (by identical
behavior we mean that in each of these rounds either both nodes transmit or both
remain silent) then assign these labels to the nodes of pair p. Otherwise assign
the two available labels that will result in the highest number of transmissions
in rounds $r + 1, \dots, r + 8ct(n)$.

If more than $n/8$ pairs among pairs $1, \dots, n/4$ are assigned labels by the
first choice (identical behavior for many rounds), the total broadcasting time
exceeds $c\,n\,t(n)$, which is a contradiction: each such pair delays the next pair
by at least $8ct(n)$ rounds, because its two nodes collide whenever they transmit.
Hence, at least $n/8$ pairs are assigned labels by the second choice (maximum
number of transmissions). Fix a pair $p' < n/4$ whose nodes were assigned labels by
the second choice. Associate with each available label a binary sequence of
length $8ct(n)$, where 1 in position i represents the decision to transmit in
round $r + i$ and 0 represents the decision to keep silent. Let S be the set of
those binary sequences. The chosen labels will be the labels associated with the
binary sequences from S with most occurrences of 1's. All the binary sequences in
S must be different, otherwise the first choice of labels would be possible for
nodes of pair $p'$.

Let $x$ be the maximum number of 1's in a sequence of S. The number
of binary sequences of length $8ct(n)$ with at most $x$ 1's is less than
$x(8ct(n))^x$. As the number of available labels is at least $n/2 - 2$,
we have $x(8ct(n))^x > n/2 - 2$. Hence $x \in \Omega(\log n / \log\log n)$, because $t(n)$ is
polylogarithmic in n. Since at least $n/8$ pairs will be assigned labels by the
second choice, and thus contribute $\Omega(\log n / \log\log n)$ to the cost, the total
cost is $\Omega(n \log n / \log\log n)$. □

Abstract. This paper extends known results on the complexity
of word equations and equations in free groups in order to include the 
presence of rational constraints, i.e., such that a possible solution has 
to respect a specification given by a rational language. Our main result 
states that the existential theory of equations with rational constraints 
in free groups is PSPACE-complete. 

Keywords: Formal languages, equations, regular language, free group. 



1 Introduction 

In 1977 (resp. 1983) Makanin proved that the existential theory of equations in 
free monoids (resp. free groups) is decidable by presenting algorithms which solve 
the satisfiability problem for a single word equation (resp. group equation) with 
constants [13,14,15]. These algorithms are very complex: for word equations the
running time was first estimated by several towers of exponentials and it took
more than 20 years to lower it to the best known bound for Makanin’s
original algorithm, which is to date EXPSPACE [7]. For equations in free groups 
Koscielski and Pacholski have shown that the scheme of Makanin is not primitive 
recursive. 

Recently Plandowski found a different approach to solve word equations and 
showed that the satisfiability problem for word equations is in PSPACE [18].
Roughly speaking, his method uses data compression (first introduced for word
equations in [19]) plus properties of factorization of words. Gutierrez extended
this method to the case of free groups [9]. Thus, a non-primitive recursive scheme
for solving equations in free groups was replaced by a polynomial space bounded 
algorithm. 

In this paper we extend the results of [18,9] in order to include the presence
of rational constraints. Rational constraints mean that a possible solution
has to respect a specification which is given by a regular word language. Our main
result states that the existential theory of equations in free groups with rational
constraints is PSPACE-complete. The corresponding PSPACE-completeness for
word equations with regular constraints was first announced by Rytter, see
[18, Thm. 1] and [20].

The idea to consider regular constraints in the case of word equations is due 
to Schulz [21]. The importance of this concept, first pointed out by Schulz, can
be exemplified by: the application of Schulz’ result to monadic simultaneous rigid
E-unification [6]; the use of regular constraints in [5] as a basic (and necessary) tool
when showing that Makanin’s result holds in free partially commutative monoids;
the proof, in a forthcoming paper of Diekert and Muscholl, of the decidability of
the existential theory of equations in graph groups (open problem stated in [5])
by using the present result; and the positive answer, by Diekert and Lohrey [4],
to the question (cf. [16]) whether the existential theory of equations in free
products of free and finite groups is decidable, obtained by relying on the general
form of Theorem 2 below (we allow fixed points for the involution on $\Gamma$).

Our paper deals with the existential theory. For free groups it is also known 
that the positive theory without constraints is decidable, see [15]. Thus, one can 
allow also universal quantifiers but no negations. Note that we cannot expect 
that the positive theory of equations with rational constraints in free groups 
be decidable, since we can code the word case (with regular constraints) which 
is known to be undecidable. On the other hand, a negation leads to a positive 
constraint of a very restricted type, so it is an interesting question under which
type of constraints the positive theory remains decidable. 

Our proof of Theorem 1 is in the first step a reduction to the satisfiability 
problem of a single equation with regular constraints in a free monoid with 
involution. In order to avoid an exponential blow-up, we do not use a reduction as 
in [15], but a much simpler one. In particular, we can handle negations simply by
positive rational constraints. In the second step we show that the satisfiability
problem of a single equation with regular constraints in a free monoid with 
involution is still in PSPACE. We extend the method of [18,9] such that it copes 
with the involution and with rational constraints. There seems to be no direct 
reduction to the word case or to the case of free groups without constraints. So 
we cannot use these results as black boxes. Because there is not enough space to 
present the whole proof in this extended abstract, we focus on those parts where 
there is a substantial difference to the case without constraints. In particular, 
we develop the notion of maximal free interval, a concept which can be used
even when there are no constraints, but when one is interested in solutions
other than the one of minimal length. The missing proofs can be found in [10],
which is available on the web.^

2 Equations with Rational Constraints in Free Groups 

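Rational Languages, Equations. Let $\Sigma$ be a finite alphabet and let
$\bar\Sigma = \{\bar a \mid a \in \Sigma\}$. We use the convention that $\bar{\bar a} = a$.
Define $\Gamma = \Sigma \cup \bar\Sigma$. Hence $\bar{\ } : \Gamma \to \Gamma$
is an involution which is extended to $\Gamma^*$ by
$\overline{a_1 \cdots a_n} = \bar a_n \cdots \bar a_1$ for $n \ge 0$ and
$a_i \in \Gamma$. We usually will write just $\Gamma$ instead of $(\Gamma, \bar{\ })$.
A word $w \in \Gamma^*$ is freely reduced, if it contains no factor of the form
$a\bar a$ with $a \in \Gamma$.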

^ The file HagenahDiss2000.ps is available at http://inf.informatik.uni-stuttgart.de/ifi/ti/veroeffentlichungen/psfiles
The elements of the free group $F(\Sigma)$ are represented by freely reduced words
in $\Gamma^*$. We read $\bar a$ as $a^{-1}$ in $F(\Sigma)$. There is a canonical
homomorphism $\hat{\ } : \Gamma^* \to F(\Sigma)$, which eliminates all factors of
the form $a\bar a$ from a word.
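
The involution on words and the elimination of factors $a\bar a$ can be sketched in a few lines. The encoding below is an assumption made only for illustration: a lowercase letter stands for a generator and the matching uppercase letter for its formal inverse. The function names are hypothetical.

```python
def bar_letter(a: str) -> str:
    """Formal inverse of a single letter (assumed encoding: case swap)."""
    return a.swapcase()


def bar(word: str) -> str:
    """Involution on words: reverse the word and invert every letter."""
    return "".join(bar_letter(a) for a in reversed(word))


def freely_reduce(word: str) -> str:
    """Cancel all factors a * bar(a) with a stack; returns the reduced word."""
    stack = []
    for a in word:
        if stack and stack[-1] == bar_letter(a):
            stack.pop()          # a cancels the previous letter
        else:
            stack.append(a)
    return "".join(stack)
```

With this encoding, `freely_reduce(w + bar(w))` is the empty word for every `w`, which mirrors the fact that $\bar w$ represents the inverse of $w$ in the free group.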

The class of rational languages in $F(\Sigma)$ is inductively defined as follows:
Every finite subset of $F(\Sigma)$ is rational. If $P_1, P_2 \subseteq F(\Sigma)$ are
rational, then $P_1 \cup P_2$, $P_1 \cdot P_2$, and $P_1^*$ are rational. Hence,
$P \subseteq F(\Sigma)$ is rational if and only if $P = \{\hat w \mid w \in P'\}$ for
some regular language $P' \subseteq \Gamma^*$. It is well-known that the
family of rational group languages is an effective Boolean algebra, in particular,
it is closed under complementation [1]. (See also [2, Sect. III.2].)

In the following $\Omega$ denotes a finite set of variables (or unknowns) and we let
$\bar{\ } : \Omega \to \Omega$ be an involution without fixed points. An equation with
rational constraints in free groups is an equation $W = 1$ in free groups plus
constraints on the variables of the type $X \in P$, for $P$ a rational language. The
existential fragment of these equations is the set of closed formulas of the form
$\exists X_1 \ldots \exists X_n\, B$, where $X_i \in \Omega$ and $B$ is a Boolean
combination of atomic formulas which are either of the form $(W = 1)$ or $(X_i \in P)$,
where $W \in (\Gamma \cup \Omega)^*$ and $P \subseteq F(\Sigma)$ is
a rational language. The existential theory of equations with rational constraints
in free groups is the set of such formulas which are valid in the free group $F(\Sigma)$.

Theorem 1. The existential theory of equations with rational constraints in free
groups is PSPACE-complete.

Proof (Sketch). The PSPACE-hardness follows easily from [12] and is not dis- 
cussed further. The proof for the inclusion in PSPACE is a reduction to the 
corresponding problem over free monoids with involution. It goes as follows. 

First, we may assume that the input is given by some propositional formula
which is in fact a conjunction of formulae of type $W = 1$, $X \in P$, $X \notin P$
with $W \in (\Gamma \cup \Omega)^*$, $X \in \Omega$, and $P \subseteq F(\Sigma)$ rational.^
This is achieved by using De Morgan rules to push negations to the level of atomic
formulas, then replacing $W \neq 1$ by $\exists X : WX = 1 \wedge X \notin \{1\}$ (and
pushing the quantifier to the outermost level), and finally eliminating the
disjunctions by replacing non-deterministically every subformula of type
$A \vee B$ by either $A$ or $B$.

It is not difficult to see that we may also assume that $|W| = 3$ (use the
equivalence of $x_1 \cdots x_n = 1$ and
$\exists Y : x_1 x_2 Y = 1 \wedge \bar Y x_3 \cdots x_n = 1$).
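
The repeated use of this equivalence can be sketched as follows. The sketch is purely symbolic: symbols are strings, fresh variables are named `Y1`, `Y2`, ... and a leading `~` marks the involution; all of these naming choices are assumptions for illustration.

```python
def triangulate(symbols):
    """Split x1 ... xn = 1 (n >= 3) into equations of length 3.

    Each step replaces x1 x2 (rest) = 1 by x1 x2 Y = 1 and ~Y (rest) = 1,
    where Y is a fresh variable and ~Y denotes its image under the involution.
    """
    equations, rest, fresh = [], list(symbols), 0
    while len(rest) > 3:
        fresh += 1
        y = f"Y{fresh}"
        equations.append([rest[0], rest[1], y])
        rest = [f"~{y}"] + rest[2:]
    equations.append(rest)
    return equations
```

An equation of length $n$ yields $n - 2$ equations of length three and $n - 3$ fresh variables, so the transformation stays linear in the input size.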

Finally, we switch to the existential theory of equations with regular
constraints in free monoids with involution. The key point of the translation here is
the fact that rational languages $P$ are in essence regular word languages over $\Gamma$
such that $P \subseteq N$, where $N \subseteq \Gamma^*$ is the regular set of all freely
reduced words. The language $N$ is accepted by a deterministic finite automaton with
$|\Gamma| + 1$ states. Then a positive constraint has just the interpretation over words
and for a negative constraint we replace $X \notin P$ by $X \notin P \wedge X \in N$.
Details are left to the reader.
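
A minimal sketch of such an automaton: one start state plus one state per letter, where a state remembers the last letter read. Transitions that would read $\bar a$ directly after $a$ are simply left out, so this is a partial automaton; treating it that way is an assumption about how the count of $|\Gamma| + 1$ states is meant. The function names are hypothetical.

```python
def reduced_word_dfa(gamma, bar):
    """Partial DFA for freely reduced words; `bar` inverts a single letter."""
    start = None                       # state before any letter is read
    states = [start] + list(gamma)     # |gamma| + 1 states in total
    delta = {
        (q, a): a                      # the new state is the letter just read
        for q in states
        for a in gamma
        if q is None or a != bar(q)    # forbid reading bar(q) right after q
    }
    return states, delta, start


def accepts(delta, start, word):
    """Every state is accepting; a word is rejected only by a missing transition."""
    q = start
    for a in word:
        if (q, a) not in delta:
            return False
        q = delta[(q, a)]
    return True
```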

As for the formulas $xyz = 1$, note that they have a solution if and only if they
have a solution in freely reduced words. Then we can replace each subformula
$xyz = 1$ by the conjunction
$\exists P\, \exists Q\, \exists R : x = PQ \wedge y = \bar Q R \wedge z = \bar R \bar P$
using simple arguments.
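
To see why freely reduced solutions of $xyz = 1$ decompose in this way, the following sketch computes the three pieces from reduced words $x$ and $y$, reading the decomposition as $x = PQ$, $y = \bar Q R$, $z = \bar R \bar P$. Here $Q$ is the maximal part of $x$ that cancels against a prefix of $y$. The letter encoding (case swap for the involution) and the function names are assumptions for illustration.

```python
def bar(word: str) -> str:
    """Involution on words (assumed encoding: reverse and swap case)."""
    return "".join(a.swapcase() for a in reversed(word))


def split_triangle(x: str, y: str):
    """For freely reduced x, y return (P, Q, R) with x = P Q and y = bar(Q) R.

    Q is the longest suffix of x that cancels against a prefix of y,
    so P R is freely reduced and z = bar(R) bar(P) satisfies x y z = 1.
    """
    k = 0
    while k < min(len(x), len(y)) and x[len(x) - 1 - k] == y[k].swapcase():
        k += 1
    return x[:len(x) - k], x[len(x) - k:], y[k:]
```

For example, with `x = "abA"` and `y = "aBc"` the cancelling part is `Q = "bA"`, and `z = bar(R) + bar(P)` is the reduced word that closes the triangle.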

^ The reason for keeping $X \notin P$ instead of $X \in \tilde P$ where
$\tilde P = F(\Sigma) \setminus P$ is that complementation may involve an exponential
blow-up of the state space.







Using a standard procedure to replace a conjunction of word equations by a
single word equation we may assume that our input is given by a single equation
$L = R$ with $L, R \in (\Gamma \cup \Omega)^+$ and by two lists
$(X_j \in P_j,\ 1 \le j \le m)$ and $(X_j \notin P_j,\ m < j \le k)$ where each
$P_j \subseteq \Gamma^*$ is specified by some non-deterministic
automaton $A_j = (Q_j, \Gamma, \delta_j, I_j, F_j)$.

The question is whether the input is satisfiable, i.e. whether there is a
solution. At this point, Boolean matrices are a better representation than finite
automata. Let $Q$ be the disjoint union of the state spaces $Q_j$, assume
$Q = \{1, \ldots, n\}$. Let $\delta = \bigcup_j \delta_j$, then
$\delta \subseteq Q \times \Gamma \times Q$ and with each $a \in \Gamma$
we can associate a Boolean matrix $g(a) \in \mathbb{B}^{n \times n}$ such that
$g(a)_{i,j}$ is the truth value of "$(i, a, j) \in \delta$".
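
The passage from automata to Boolean matrices can be sketched as below. States are numbered from 0 rather than 1, transitions are triples `(i, a, j)`, and acceptance of a word is read off the Boolean product of the letter matrices; the function names are hypothetical.

```python
def letter_matrix(n, delta, a):
    """g(a)[i][j] is True iff (i, a, j) is a transition (states 0..n-1)."""
    g = [[False] * n for _ in range(n)]
    for (i, b, j) in delta:
        if b == a:
            g[i][j] = True
    return g


def bool_mul(A, B):
    """Product of two square Boolean matrices (OR of ANDs)."""
    n = len(A)
    return [[any(A[i][k] and B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def word_matrix(n, delta, word):
    """g(w) = g(a_1) ... g(a_m); the identity matrix for the empty word."""
    M = [[i == j for j in range(n)] for i in range(n)]
    for a in word:
        M = bool_mul(M, letter_matrix(n, delta, a))
    return M


def accepted(n, delta, initial, final, word):
    """The word is accepted iff g(w) links some initial to some final state."""
    M = word_matrix(n, delta, word)
    return any(M[i][j] for i in initial for j in final)
```

The point of the representation is that only the matrix $g(w)$, not the word $w$ itself, is needed to decide membership, and there are finitely many such matrices.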

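Since our monoids need an involution, we will work with $2n \times 2n$ Boolean
matrices. Henceforth $M$ denotes the following monoid with involution,

$$M = \left\{ \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} \;\middle|\; A, B \in \mathbb{B}^{n \times n} \right\}
\quad\text{where}\quad
\overline{\begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}} = \begin{pmatrix} B^T & 0 \\ 0 & A^T \end{pmatrix}$$

and where the operator $^T$ means transposition.

We define a homomorphism $h : \Gamma^* \to M$ by

$$h(a) = \begin{pmatrix} g(a) & 0 \\ 0 & g(\bar a)^T \end{pmatrix} \quad\text{for } a \in \Gamma,$$

where the mapping $g : \Gamma \to \mathbb{B}^{n \times n}$ is defined as above. The
homomorphism $h$ can be computed in polynomial time and it respects the involution.
Now, for each regular language $P_j$ we compute vectors $I_j, F_j \in \mathbb{B}^{2n}$
such that for all $w \in \Gamma^*$ we have the equivalence:
$w \in P_j \Leftrightarrow I_j^T h(w) F_j = 1$. Having done these computations
we make a non-deterministic guess $\rho(X) \in M$ for each variable $X \in \Omega$. We
verify $\rho(\bar X) = \overline{\rho(X)}$ for all $X \in \Omega$ and whenever there is
a constraint of type $X \in P_j$ (resp. $X \notin P_j$) then we verify
$I_j^T \rho(X) F_j = 1$ (resp. $I_j^T \rho(X) F_j = 0$).

Let us make a formal definition. Let $d, n \in \mathbb{N}$. We consider an equation of
length $d$ over some $\Gamma$ and $\Omega$ with constraints in $M$ being specified by a
list $E$ containing the following items: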



— The alphabet $(\Gamma, \bar{\ })$ with involution.

— A mapping $h : \Gamma \to M$ such that $h(\bar a) = \overline{h(a)}$ for all $a \in \Gamma$.

— The alphabet $(\Omega, \bar{\ })$ with involution without fixed points.

— A mapping $\rho : \Omega \to M$ such that $\rho(\bar X) = \overline{\rho(X)}$ for all $X \in \Omega$.

— The equation $L = R$ where $L, R \in (\Gamma \cup \Omega)^+$ and $|LR| = d$.

If no confusion arises, we will denote this list simply by

$E = (\Gamma, \Omega, h, \rho, L, R)$.

A solution is a mapping $\sigma : \Omega \to \Gamma^*$ (being extended to a
homomorphism $\sigma : (\Gamma \cup \Omega)^* \to \Gamma^*$ by leaving the letters from
$\Gamma$ invariant) such that the following three conditions are satisfied:
$\sigma(L) = \sigma(R)$, $\sigma(\bar X) = \overline{\sigma(X)}$, and
$h\sigma(X) = \rho(X)$ for all $X \in \Omega$. We refer to the list $E$ as an equation
with constraints (in $M$). By the reduction above, Theorem 1 is a consequence of:

Theorem 2. The following problem can be solved in PSPACE.

INPUT: An equation $E_0 = (\Gamma_0, \Omega_0, h_0, \rho_0, L_0, R_0)$.

QUESTION: Is there a solution $\sigma : \Omega_0 \to \Gamma_0^*$?
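
The monoid of block-diagonal Boolean matrices with its involution can be sketched as follows. An element $\mathrm{diag}(A, B)$ is stored as a pair `(A, B)`, the involution is read as $\mathrm{diag}(A, B) \mapsto \mathrm{diag}(B^T, A^T)$ and $h(a)$ as $\mathrm{diag}(g(a), g(\bar a)^T)$. The letter encoding (case swap) and the function names are assumptions for illustration.

```python
def transpose(A):
    return [list(row) for row in zip(*A)]


def bool_mul(A, B):
    """Product of two square Boolean matrices."""
    n = len(A)
    return [[any(A[i][k] and B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mul(x, y):
    """Multiplication in M: block-diagonal matrices multiply blockwise."""
    return (bool_mul(x[0], y[0]), bool_mul(x[1], y[1]))


def inv(x):
    """Involution on M: diag(A, B) -> diag(B^T, A^T)."""
    return (transpose(x[1]), transpose(x[0]))


def h_letter(g, a):
    """h(a) = diag(g(a), g(bar a)^T); bar is the case swap (assumed encoding)."""
    return (g[a], transpose(g[a.swapcase()]))


def h_word(g, n, word):
    """Homomorphic extension of h to words; the empty word maps to the identity."""
    ident = [[i == j for j in range(n)] for i in range(n)]
    x = (ident, ident)
    for a in word:
        x = mul(x, h_letter(g, a))
    return x
```

Because transposition reverses products, `inv` is an anti-homomorphism, and `h_word` of the barred word equals `inv` of `h_word` of the word. This is the property the text calls respecting the involution.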
3 Equations with Regular Constraints over Free Monoids with Involution

During the procedure which solves Theorem 2 one has to consider various other
equations with constraints in $M$. Following Plandowski we will use data
compression for words in $(\Gamma \cup \Omega)^*$ in terms of exponential expressions.

Exponential Expressions. Exponential expressions (their evaluation and their
size) are inductively defined as follows:

— Every word $w \in \Gamma^*$ denotes an exponential expression. The evaluation
$\mathrm{eval}(w)$ is equal to $w$, its size $\|w\|$ is equal to the length $|w|$.

— If $e, e'$ are exponential expressions, so is $ee'$, the evaluation is the
concatenation, $\mathrm{eval}(ee') = \mathrm{eval}(e)\,\mathrm{eval}(e')$, and
$\|ee'\| = \|e\| + \|e'\|$.

— If $e$ is an exponential expression and $k \in \mathbb{N}$, then $(e)^k$ is an
exponential expression, and $\mathrm{eval}((e)^k) = (\mathrm{eval}(e))^k$ and
$\|(e)^k\| = \log(k) + \|e\|$.

It is not difficult to show that the length of $\mathrm{eval}(e)$ is at most exponential
in the size of $e$. Moreover, let $u \in \Gamma^*$ be a factor of a word $w \in \Gamma^*$
which can be represented by some exponential expression of size $p$. Then we find an
exponential expression of size at most $2p^2$ that represents the factor $u$.
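
The inductive definition translates directly into a small data type. The sketch below is only illustrative: the class and function names are hypothetical, and the logarithm is taken to base 2, which the text does not specify.

```python
import math
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Word:
    w: str


@dataclass(frozen=True)
class Concat:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Power:
    base: "Expr"
    k: int


Expr = Union[Word, Concat, Power]


def evaluate(e: Expr) -> str:
    """eval(e): expand the expression into the word it denotes."""
    if isinstance(e, Word):
        return e.w
    if isinstance(e, Concat):
        return evaluate(e.left) + evaluate(e.right)
    return evaluate(e.base) * e.k


def size(e: Expr) -> float:
    """||e||: word length, sum for concatenation, log(k) + ||e|| for powers."""
    if isinstance(e, Word):
        return float(len(e.w))
    if isinstance(e, Concat):
        return size(e.left) + size(e.right)
    return math.log2(e.k) + size(e.base)   # base 2 is an assumption
```

For instance, `Power(Concat(Word("ab"), Power(Word("c"), 4)), 8)` has size 8 but evaluates to a word of length 48, which illustrates the exponential gap between size and evaluated length.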

We say that an exponential expression $e$ is admissible, if its size $\|e\|$ is
bounded by some fixed polynomial in the input size of the equation $E_0$. Let
$E = (\Gamma, \Omega, h, \rho, L, R)$ and $e_L, e_R$ be exponential expressions with
$\mathrm{eval}(e_L) = L$ and $\mathrm{eval}(e_R) = R$. We say that
$E_e = (\Gamma, \Omega, h, \rho, e_L, e_R)$ is admissible, if $e_L e_R$ is
admissible, $|\Gamma \setminus \Gamma_0| \le \|e_L e_R\| + 2d$, $\Omega \subseteq \Omega_0$,
and $h(a) = h_0(a)$ for $a \in \Gamma \cap \Gamma_0$.
We say that $E_e$ represents the equation $E$. For two admissible equations with
constraints $E$ and $E'$ we write $E \equiv E'$, if $E$ and $E'$ represent the same object.

Because of regular constraints, we have to formalize carefully the basic
operations over these equations in order to move from one equation to another.

Base Changes. Let $E' = (\Gamma', \Omega, h', \rho, L', R')$ be an equation. A mapping
$\beta : \Gamma' \to \Gamma^*$ is a base change if both
$\beta(\bar a) = \overline{\beta(a)}$ and $h'(a) = h\beta(a)$ for all $a \in \Gamma'$.
The new equation is $\beta_*(E') = (\Gamma, \Omega, h, \rho, \beta(L'), \beta(R'))$.
We say that $\beta$ is admissible if $|\Gamma \cup \Gamma'|$ has polynomial size and if
for each $a \in \Gamma'$, $\beta(a)$ has an admissible exponential representation.

If $\beta : \Gamma' \to \Gamma^*$ is an admissible base change and if $L' = R'$ is given
by a pair of admissible exponential expressions, then we can represent $\beta_*(E')$ by
some admissible equation with constraints which is computable in polynomial time.

Lemma 1. Let $E'$ be an equation with constraints in $M$ and
$\beta : \Gamma' \to \Gamma^*$ be a base change. If $\sigma' : \Omega \to \Gamma'^*$ is a
solution of $E'$, then $\sigma = \beta\sigma' : \Omega \to \Gamma^*$ is a
solution of $\beta_*(E')$.

Projections. Let $\Gamma \subseteq \Gamma'$ be alphabets with involution. A projection is
a homomorphism $\pi : \Gamma'^* \to \Gamma^*$ preserving the involution and leaving
$\Gamma$ fixed. If $h : \Gamma \to M$ is given, then a projection $\pi$ defines also
$h' : \Gamma' \to M$ by $h' = h\pi$. For an equation $E = (\Gamma, h, \Omega, \rho, L, R)$
we define $\pi^*(E) = (\Gamma', h\pi, \Omega, \rho, L, R)$.
Note that every projection $\pi : \Gamma'^* \to \Gamma^*$ defines also a base change
$\pi_*$ such that $\pi_*\pi^*(E) = E$.
Lemma 2. Let $\Gamma \subseteq \Gamma'$ be as above and let
$E = (\Gamma, \Omega, h, \rho, L, R)$ and $E' = (\Gamma', \Omega, h', \rho, L, R)$. Then
there is a projection $\pi : \Gamma'^* \to \Gamma^*$ such that $\pi^*(E) = E'$, if and
only if both $h'(\Gamma') \subseteq h(\Gamma^*)$ and $a = \bar a$ implies
$h'(a) \in h(\{w \in \Gamma^* \mid w = \bar w\})$ for all $a \in \Gamma'$. Moreover, if
$\sigma'$ is a solution of $E'$, then we effectively
find a solution $\sigma$ for $E$ with $|\sigma(L)| \le 2|M||\sigma'(L)|$.

Lemma 2 says that in order to test whether there exists a projection
$\pi : \Gamma'^* \to \Gamma^*$ such that $\pi^*(E) = E'$, we need only space to store
some Boolean matrices of $\mathbb{B}^{2n \times 2n}$; we do not need an explicit
description of $\pi : \Gamma'^* \to \Gamma^*$ itself. Only if $n$ becomes a substantial
part of the input size might we need
the full power of PSPACE (PSPACE-hardness of the satisfiability problem).

Shifts. Let $\Omega' \subseteq \Omega$ be a subset of the variables which is closed under
involution, and let $\rho' : \Omega' \to M$ with $\rho'(\bar X) = \overline{\rho'(X)}$
(we do not require that $\rho'$ is the restriction of $\rho$). A shift is a mapping
$\delta : \Omega \to \Gamma^*\Omega'\Gamma^* \cup \Gamma^*$ such that the
following conditions are satisfied:

i) $\delta(X) \in \Gamma^* X \Gamma^*$ for all $X \in \Omega'$,

ii) $\delta(X) \in \Gamma^*$ for all $X \in \Omega \setminus \Omega'$,

iii) $\delta(\bar X) = \overline{\delta(X)}$ for all $X \in \Omega$.

The mapping $\delta$ is extended to a homomorphism
$\delta : (\Gamma \cup \Omega)^* \to (\Gamma \cup \Omega')^*$ by
leaving the elements of $\Gamma$ invariant. For an equation
$E = (\Gamma, h, \Omega, \rho, L, R)$, we define the equation
$\delta_*(E) = (\Gamma, \Omega', h, \rho', \delta(L), \delta(R))$ where $\rho'$ is such
that $\rho(X) = h(u)\rho'(X)h(v)$ for $\delta(X) = uXv$, and $\rho(X) = h(w)$ for
$\delta(X) = w \in \Gamma^*$.
We say that $\delta_*(E)$ is a shift of $E$.

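Lemma 3. In the notation above, let $E' = \delta_*(E)$ for some shift
$\delta : \Omega \to \Gamma^*\Omega'\Gamma^* \cup \Gamma^*$. If
$\sigma' : \Omega' \to \Gamma^*$ is a solution of $E'$, then
$\sigma = \sigma'\delta : \Omega \to \Gamma^*$ is a
solution of $E$. Moreover, we have $\sigma(L) = \sigma'(L')$.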



Lemma 4. The following problem can be solved in PSPACE.

INPUT: Two equations with constraints $E$ and $E'$.

QUESTION: Is there some shift $\delta : \Omega \to \Gamma^*\Omega'\Gamma^* \cup \Gamma^*$
such that $\delta_*(E) \equiv E'$?

Moreover, if $\delta_*(E) \equiv E'$, then we have
$\delta(X) = \mathrm{eval}(e_u)\,X\,\mathrm{eval}(e_v)$ for all
$X \in \Omega'$ and for suitable admissible exponential expressions $e_u, e_v$. Similarly,
$\delta(X) = \mathrm{eval}(e_w)$ for all $X \in \Omega \setminus \Omega'$.



Remark 1. We can think of a shift
$\delta : \Omega \to \Gamma^*\Omega'\Gamma^* \cup \Gamma^*$ as a partial solution
in the following sense. Assume we have an idea about $\sigma(X)$ for some $X \in \Omega$.
Then we might guess $\sigma(X)$ entirely. In this case we can define
$\delta(X) = \sigma(X)$ and we have $X \notin \Omega'$. For some other $X$ we might guess
only some prefix $u$ and some suffix $v$ of $\sigma(X)$. Then we define
$\delta(X) = uXv$ and we have to guess some
$\rho'(X) \in M$ such that $\rho(X) = h(u)\rho'(X)h(v)$. If our guess was correct, then
such $\rho'(X)$ must exist. We have partially specified the solution and we continue this
process by replacing the equation $L = R$ by the new equation $\delta(L) = \delta(R)$.
4 The Search Graph and Plandowski's Algorithm

The nodes of the search graph are admissible equations with constraints in $M$.
Let $E, E'$ be two nodes. We define an arc $E \to E'$, if there are a projection $\pi$,
a shift $\delta$, and an admissible base change $\beta$ such that
$\delta_*(\pi^*(E)) \equiv \beta_*(E')$.

Lemma 5. The following problem can be decided in PSPACE.

INPUT: Admissible equations with constraints $E$ and $E'$.

QUESTION: Is there an arc $E \to E'$ in the search graph?

Proof. (Sketch) We first guess some alphabet $(\Gamma'', \bar{\ })$ of polynomial size
together with $h'' : \Gamma'' \to M$. Then we guess some admissible base change
$\beta : \Gamma' \to \Gamma''^*$ such that $h' = h''\beta$ and we compute $\beta_*(E')$
in polynomial time. Next we check using Remark 1 and Lemma 4 that there is a projection
$\pi : \Gamma''^* \to \Gamma^*$ and that there is a shift
$\delta : \Omega \to \Gamma''^*\Omega'\Gamma''^* \cup \Gamma''^*$ such that
$\delta_*(\pi^*(E)) \equiv \beta_*(E')$. □

Plandowski's algorithm works on $E_0 = (\Gamma_0, \Omega_0, h_0, \rho_0, L_0, R_0)$ as follows:

1. $E := E_0$

2. while $\Omega \neq \emptyset$ do

   Guess an admissible equation $E'$ with constraints in $M$.

   Verify that $E \to E'$ is an arc in the search graph.

   $E := E'$

3. return "$\mathrm{eval}(e_L) = \mathrm{eval}(e_R)$"

By Lemmata 1, 2, and 3, if $E \to E'$ is an arc in the search graph and $E'$ is
solvable, then $E$ is solvable, too. Thus, if the algorithm returns true, then $E_0$
is solvable. The proof of Theorem 2 is therefore reduced to the statement that
if $E_0$ is solvable, then the search graph contains a path to some node without
variables in which the exponential expressions defining the equation evaluate to the
same word (called a terminal node).

Remark 2. If $E \to E'$ is due to some $\pi : \Gamma''^* \to \Gamma^*$,
$\delta : \Omega \to \Gamma''^*\Omega'\Gamma''^* \cup \Gamma''^*$,
and $\beta : \Gamma'^* \to \Gamma''^*$, then a solution $\sigma' : \Omega' \to \Gamma'^*$
of $E'$ yields the solution $\sigma = \pi(\beta\sigma')\delta$. Hence we may assume that
the length of a solution has increased
by at most an exponential factor. Since we are going to perform the search in
a graph of at most exponential size, we automatically get a doubly exponential
upper bound for the length of a minimal solution by backwards computation
on such a path. This is still the best known upper bound (although a singly
exponential bound is conjectured), see [17].



5 The Search Graph Contains a Path to a Terminal Node 

This section proves the existence of a path to a terminal node in the
search graph. The technique used is a generalization of the one used in [18] for
word equations, in [9] for free group equations, and in [3] for word equations with
regular constraints. Due to lack of space in this extended abstract we focus only
on the few points where the technique differs substantially from those papers.
For the other parts we will just refer the reader to the papers above.
The Exponent of Periodicity. Let $w \in \Gamma^*$ be a word. The exponent of
periodicity $\exp(w)$ is defined as the supremum of the $\alpha \in \mathbb{N}$ such that
$w = u p^\alpha v$ for suitable $u, v, p \in \Gamma^*$ and $p \neq 1$. It is clear that
$\exp(w) > 0$ if $w$ is not empty.
For an equation $E = (\Gamma, \Omega, h, \rho, L, R)$ the exponent of periodicity,
denoted by $\exp(E)$, is defined as

$\exp(E) = \inf(\{\exp(\sigma(L)) \mid \sigma \text{ is a solution of } E\} \cup \{\infty\})$.
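
For a single word the definition can be checked by brute force. The sketch below tries every start position and every non-empty period $p$ and counts how often $p$ repeats consecutively; it is meant only to make the definition concrete, not to be efficient, and the function name is hypothetical.

```python
def exp_periodicity(w: str) -> int:
    """Largest alpha with w = u p^alpha v and p non-empty (0 for the empty word)."""
    best = 0
    n = len(w)
    for start in range(n):
        for plen in range(1, n - start + 1):
            p = w[start:start + plen]
            reps, pos = 1, start + plen
            while w.startswith(p, pos):    # another full copy of p follows
                reps += 1
                pos += plen
            best = max(best, reps)
    return best
```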

The well-known result from word equations [11] transfers to the situation here:
in order to prove Theorem 2 we may assume that $E_0$ is solvable and
$\exp(E_0) \in 2^{O(d + n \log n)}$. The case of word equations with regular constraints
is done in [3], and that of monoids with involution in [8]. A combination of these
methods gives what we need here. The detailed proof has been given in [10].

Free Intervals. The following development will be fully justified at the end of
the subsection and has to do with handling the constraints. Without constraints,
free intervals of length more than one do not appear in a minimal solution,
making this notion unnecessary. This is not true in the presence of constraints.
Free intervals handle this case and, moreover, tell us that the bounds on the
exponent of periodicity are the only restriction we need on solutions.

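Given a word $w \in \Gamma^*$, let $\{0, \ldots, |w|\}$ be the set of its positions. An
interval on these positions is a formal object denoted $[\alpha, \beta]$ with
$0 \le \alpha, \beta \le |w|$, and $\overline{[\alpha, \beta]} = [\beta, \alpha]$. For
$w = a_1 \cdots a_m$, we define $w[\alpha, \beta] = a_{\alpha+1} \cdots a_\beta$ if
$\alpha < \beta$, $w[\alpha, \beta] = \overline{a_{\beta+1} \cdots a_\alpha}$ if
$\alpha > \beta$, and the empty word if $\alpha = \beta$. Observe that these
notations are consistent so that $\overline{w[\alpha, \beta]} = w\overline{[\alpha, \beta]}$.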
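
The interval notation maps directly onto string slicing, since positions sit between letters. In the sketch below a reversed interval yields the barred factor; the case-swap encoding of the involution and the function names are assumptions for illustration.

```python
def bar(word: str) -> str:
    """Involution on words (assumed encoding: reverse and swap case)."""
    return "".join(a.swapcase() for a in reversed(word))


def factor(w: str, alpha: int, beta: int) -> str:
    """w[alpha, beta]: letters alpha+1..beta, or the barred factor if alpha > beta."""
    if alpha <= beta:
        return w[alpha:beta]
    return bar(w[beta:alpha])
```

The consistency noted in the text then reads `bar(factor(w, a, b)) == factor(w, b, a)` for all positions `a` and `b`.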

Let $\sigma_0$ be a solution of $L_0 = R_0$, where $L_0 = x_1 \cdots x_g$ and
$R_0 = x_{g+1} \cdots x_d$ and $x_i \in (\Gamma_0 \cup \Omega_0)$. Then we have
$w_0 = \sigma_0(L_0) = \sigma_0(R_0)$. Denote $m_0 = |w_0|$.
For each $i \in \{1, \ldots, d\}$ we define positions $l(i)$ and $r(i)$ as follows:

$l(i) = |\sigma_0(x_1 \cdots x_{i-1})| \bmod m_0 \in \{0, \ldots, m_0 - 1\}$,
$r(i) = m_0 - (|\sigma_0(x_{i+1} \cdots x_d)| \bmod m_0) \in \{1, \ldots, m_0\}$.

In particular, we have $l(1) = l(g+1) = 0$ and $r(g) = r(d) = m_0$. The set of
$l$ and $r$ positions is called the set of cuts. There are at most $d$ cuts which cut
the word $w_0$ in at most $d - 1$ factors. We say that $[\alpha, \beta]$ contains a cut
$\gamma$ if $\min\{\alpha, \beta\} < \gamma < \max\{\alpha, \beta\}$.
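
The positions $l(i)$ and $r(i)$ can be computed from the images of the symbols under a solution. The sketch reads $r(i)$ as $m_0$ minus the remaining length modulo $m_0$, which is the reading that gives $r(g) = r(d) = m_0$; the input format (a list of images plus the index $g$) and the function name are assumptions for illustration.

```python
def cuts(images, g):
    """images[i-1] is the image of x_i; x_1..x_g form L_0, the rest R_0.

    Returns the dictionaries l and r (indexed from 1) and the sorted set of cuts.
    """
    m0 = sum(len(s) for s in images[:g])
    assert m0 == sum(len(s) for s in images[g:]), "both sides must have length m0"
    left, right = {}, {}
    for i in range(1, len(images) + 1):
        before = sum(len(s) for s in images[:i - 1])
        after = sum(len(s) for s in images[i:])
        left[i] = before % m0
        right[i] = m0 - (after % m0)
    return left, right, sorted(set(left.values()) | set(right.values()))
```

For the images `["ab", "c", "a", "bc"]` with `g = 2`, both sides spell `abc`, and each image is the factor of `abc` between its left and right position.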

For convenience we henceforth assume $2 \le g < d < m_0$ whenever necessary
and make the assumption that $\sigma_0(x_i) \neq 1$ for all $1 \le i \le d$ (e.g. a guess
in some preprocessing).

We have $\sigma_0(x_i) = w_0[l(i), r(i)]$ and $\sigma_0(\bar x_i) = w_0[r(i), l(i)]$ for
$1 \le i \le d$. By our assumption, the interval $[l(i), r(i)]$ is positive.

Let us consider $i, j \in \{1, \ldots, d\}$ and $x_i = x_j$ or $x_i = \bar x_j$. For
$0 \le \mu, \nu \le r(i) - l(i)$, we define a relation $\sim$ among intervals as follows:

$[l(i) + \mu, l(i) + \nu] \sim [l(j) + \mu, l(j) + \nu]$, if $x_i = x_j$,

$[l(i) + \mu, l(i) + \nu] \sim [r(j) - \mu, r(j) - \nu]$, if $x_i = \bar x_j$.

Note that $\sim$ is a symmetric relation and $[\alpha, \beta] \sim [\alpha', \beta']$
implies both $[\beta, \alpha] \sim [\beta', \alpha']$ and
$w_0[\alpha, \beta] = w_0[\alpha', \beta']$. By $\approx$ we denote the equivalence
relation obtained by the reflexive and transitive closure of $\sim$.

An interval $[\alpha, \beta]$ is called free if none of its $\approx$-equivalent intervals
contains a cut. Clearly, the set of free intervals is closed under involution and whenever
$|\beta - \alpha| \le 1$ then $[\alpha, \beta]$ is free. It is also closed under taking
subintervals:

Lemma 6. Let $[\alpha, \beta]$ be a free interval and
$\min\{\alpha, \beta\} \le \mu, \nu \le \max\{\alpha, \beta\}$.
Then the interval $[\mu, \nu]$ is also free.

If $[\alpha, \beta]$ (assume $\alpha < \beta$) is not free, then by definition there is
some interval $[\alpha', \beta'] \approx [\alpha, \beta]$ which contains a cut $\gamma'$.
The propagation of that cut to $[\alpha, \beta]$, that is the position $\gamma$ such that
$\gamma - \alpha = |\gamma' - \alpha'|$, is called an implicit cut of $[\alpha, \beta]$.

The following observation will be used throughout: If we have
$\alpha \le \mu < \gamma < \nu \le \beta$ and $\gamma$ is an implicit cut of
$[\alpha, \beta]$, then $\gamma$ is also an implicit cut of $[\mu, \nu]$.
(The converse is not necessarily true.)

Lemma 7. Let $0 \le \alpha \le \alpha' < \beta \le \beta' \le m_0$ be such that
$[\alpha, \beta]$ and $[\alpha', \beta']$ are free intervals. Then the interval
$[\alpha, \beta']$ is free, too.

A free interval $[\alpha, \beta]$ is called maximal free if no free interval properly
contains it, i.e., if $\alpha' \le \min\{\alpha, \beta\} \le \max\{\alpha, \beta\} \le \beta'$
and $[\alpha', \beta']$ is free, then $\beta' - \alpha' = |\beta - \alpha|$.
So Lemma 7 states a key point: maximal free intervals do not overlap.

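Lemma 8. Let $[\alpha, \beta]$ be a maximal free interval. Then there are intervals
$[\gamma, \delta]$ and $[\gamma', \delta']$ such that
$[\alpha, \beta] \approx [\gamma, \delta] \approx [\gamma', \delta']$ and $\gamma$ and
$\delta'$ are cuts.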

Proposition 1. Let $\Gamma$ be the set of words $w \in \Gamma_0^*$ such that there is a
maximal free interval $[\alpha, \beta]$ with $w = w_0[\alpha, \beta]$. Then $\Gamma$ is a
subset of $\Gamma_0^+$ of size at most $2d - 2$. The set $\Gamma$ is closed under
involution.

Proof. Let $[\alpha, \beta]$ be maximal free. Then $|\beta - \alpha| \ge 1$ and
$[\beta, \alpha]$ is maximal free, too. Hence $\Gamma \subseteq \Gamma_0^+$ and $\Gamma$
is closed under involution. By Lemma 8 we may assume that $\alpha$ is a cut. Say
$\alpha < \beta$. Then $\alpha \neq m_0$ and there is no other maximal
free interval $[\alpha, \beta']$ with $\alpha < \beta'$ because of Lemma 7. Hence there
are at most $d - 1$ such intervals $[\alpha, \beta]$. Symmetrically, there are at most
$d - 1$ maximal free intervals $[\alpha, \beta]$ where $\beta < \alpha$ and $\alpha$ is a
cut. □

Why Free Intervals Are Needed. For a moment let us put $\Delta = \Gamma_0 \cup \Gamma$
where $\Gamma$ is the set defined in Proposition 1. Observe that
$\Delta \subseteq \Gamma_0^+$, and so it defines a natural projection
$\pi : \Delta^* \to \Gamma_0^*$ and a mapping $h' : \Delta \to M$ by $h' = h_0\pi$.
(Note that here we need the fact that there is no overlapping among maximal
intervals.) Consider the equation with constraints $\pi^*(E_0)$. There is an arc from
$E_0$ to $\pi^*(E_0)$ since we may always allow the base change to be the identity and
the shift to be an inclusion.

The reason to switch from $\Gamma_0$ to $\Delta$ is that, due to the constraints, the word
$w_0$ may have long free intervals. Over $\Delta$ this can be avoided. Formally, we replace
$w_0$ by a solution $w_0'$ where $w_0' \in \Gamma^*$, whose definition is based on a
factorization of $w_0$ in maximal free intervals. Recall that there is a unique sequence
$0 = \alpha_0 < \alpha_1 < \cdots < \alpha_k = m_0$ such that $[\alpha_{i-1}, \alpha_i]$
are maximal free intervals and

$w_0 = w_0[\alpha_0, \alpha_1] \cdots w_0[\alpha_{k-1}, \alpha_k]$.

Moreover, all cuts occur as some $\alpha_i$, so we can think of the factors
$w_0[\alpha_{i-1}, \alpha_i]$ as letters in $\Gamma$. Because all constants which appear
in $L_0, R_0$ are elements of $\Gamma$, the equation $L_0 = R_0$ appears identical in
$\pi^*(E_0)$.

So, replacing $w_0$ by the word $w_0' \in \Gamma^*$, we can define
$\sigma : \Omega \to \Gamma^*$ such that both $\sigma(L_0) = \sigma(R_0) = w_0'$ and
$\rho_0 = h'\sigma$, that is, $\sigma$ is a solution of $\pi^*(E_0)$.
Clearly we have $w_0 = \pi(w_0')$ and $\exp(w_0') \le \exp(w_0)$. The crucial point is
that $w_0'$ has no long free intervals anymore. (With respect to $w_0'$ and $\Delta$, all
maximal free intervals have length exactly one.)

We can assume that Plandowski's algorithm follows in a first step exactly
the arc from $E_0$ to $\pi^*(E_0)$. Phrased in a different way, we may assume that
$E_0 = \pi^*(E_0)$, hence $\Gamma$ is a subset of $\Gamma_0$.

Moreover, the inclusion $\beta : \Gamma \to \Gamma_0$ defines an admissible base change.
Consider $E_0' = \beta^*(\pi^*(E_0))$. Then we have
$E_0' = (\Gamma, \Omega_0, h, \rho_0, L_0, R_0)$ where $h$ is
the restriction of $h_0 : \Gamma_0 \to M$. The search graph contains an arc from $E_0$ to
$E_0'$ and $E_0'$ has a solution $\sigma$ with $\sigma(L_0) = w_0'$ and
$\exp(w_0') \le \exp(w_0)$.

In summary, in order to save notation we may assume for simplicity that
$E_0 = E_0'$ and $w_0 = w_0'$. We can make the following assumptions:

$L_0 = x_1 \cdots x_g$ and $g \ge 2$,

$R_0 = x_{g+1} \cdots x_d$ and $d > g$,

$\Gamma_0 = \Gamma$ and $|\Gamma| \le 2d - 2$,

$|\Omega_0| \le 2d$,

$M \subseteq \mathbb{B}^{2n \times 2n}$.

All variables $X$ occur in $L_0 R_0 \bar L_0 \bar R_0$. There is a solution
$\sigma : \Omega_0 \to \Gamma^*$ such that $w_0 = \sigma(L_0) = \sigma(R_0)$ with
$\sigma(x_i) \neq 1$ for $1 \le i \le d$ and $\rho_0 = h\sigma = h_0\sigma$. We have
$|w_0| = m_0$ and $\exp(w_0) \in 2^{O(d + n \log n)}$. All maximal free intervals have
length exactly one, i.e., every positive interval $[\alpha, \beta]$ with
$\beta - \alpha > 1$ contains an implicit cut.

The Factorization. For each integer $\ell$, $1 \le \ell \le m_0$, we define the set of
critical words $C_\ell$ as the closure under involution of the set of all words
$w_0[\gamma - \ell, \gamma + \ell]$ where $\gamma$ is a cut with
$\ell \le \gamma \le m_0 - \ell$.

A triple $(u, w, v) \in (\{1\} \cup \Gamma^\ell) \times \Gamma^+ \times (\{1\} \cup \Gamma^\ell)$
is called a block if, first, up to a possible prefix or suffix no other factor of the word
$uwv$ is a critical word, second, $u \neq 1$ if and only if a prefix of $uwv$ of length
$2\ell$ belongs to $C_\ell$, and third, $v \neq 1$ if and only if a suffix of $uwv$ of
length $2\ell$ belongs to $C_\ell$. The set of blocks is denoted by $B_\ell$ and can be
viewed as a (possibly infinite) alphabet with involution defined by
$\overline{(u, w, v)} = (\bar v, \bar w, \bar u)$.

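We can define a homomorphism $\pi_\ell : B_\ell^* \to \Gamma^*$ by
$\pi_\ell(u, w, v) = w \in \Gamma^+$ being extended to a projection
$\pi_\ell : (B_\ell \cup \Gamma)^* \to \Gamma^*$ by leaving $\Gamma$ invariant. We define
$h_\ell : (B_\ell \cup \Gamma) \to M$ by $h_\ell = h\pi_\ell$. In the following we shall
consider finite subsets $\Gamma_\ell \subseteq B_\ell \cup \Gamma$ which are closed under
involution. Then by $\pi_\ell : \Gamma_\ell^* \to \Gamma^*$ and
$h_\ell : \Gamma_\ell \to M$ we understand the restrictions of the respective
homomorphisms.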

For every non-empty word $w \in \Gamma^+$ we define its $\ell$-factorization as:

$F_\ell(w) = (u_1, w_1, v_1) \cdots (u_k, w_k, v_k) \in B_\ell^+$ (1)

where $w = w_1 \cdots w_k$ and for $1 \le i \le k$ the following conditions are satisfied:

— $v_i$ is a prefix of $w_{i+1} \cdots w_k$ and $v_i = 1$ if and only if $i = k$.

— $u_i$ is a suffix of $w_1 \cdots w_{i-1}$ and $u_i = 1$ if and only if $i = 1$.

Note that the $\ell$-factorization of a word $w$ is unique. For a factorization (1), we
define $\mathrm{head}_\ell(w) = w_1$, $\mathrm{body}_\ell(w) = w_2 \cdots w_{k-1}$ and
$\mathrm{tail}_\ell(w) = w_k$. Similarly for
$\mathrm{Head}_\ell(w) = (u_1, w_1, v_1)$,
$\mathrm{Body}_\ell(w) = (u_2, w_2, v_2) \cdots (u_{k-1}, w_{k-1}, v_{k-1})$, and
$\mathrm{Tail}_\ell(w) = (u_k, w_k, v_k)$. For $k \ge 2$ (in particular, if
$\mathrm{body}_\ell(w) \neq 1$) we have

$F_\ell(w) = \mathrm{Head}_\ell(w)\,\mathrm{Body}_\ell(w)\,\mathrm{Tail}_\ell(w)$ and
$w = \mathrm{head}_\ell(w)\,\mathrm{body}_\ell(w)\,\mathrm{tail}_\ell(w)$.

Moreover, $u_2$ is a suffix of $w_1$ and $v_{k-1}$ is a prefix of $w_k$.

Assume $\mathrm{body}_\ell(w) \neq 1$ and let $u, v \in \Gamma^*$ be any words. Then we
can view $w$ in the context $uwv$ and $\mathrm{Body}_\ell(w)$ appears as a proper factor
in the $\ell$-factorization of $uwv$. More precisely, let
$F_\ell(uwv) = (u_1, w_1, v_1) \cdots (u_k, w_k, v_k)$. Then there are
unique $1 \le p < q \le k$ such that:

$F_\ell(uwv) = (u_1, w_1, v_1) \cdots (u_p, w_p, v_p)\,\mathrm{Body}_\ell(w)\,(u_q, w_q, v_q) \cdots (u_k, w_k, v_k)$,
$w_1 \cdots w_p = u\,\mathrm{head}_\ell(w)$ and $w_q \cdots w_k = \mathrm{tail}_\ell(w)\,v$.

Finally, we note that the above definitions are compatible with the involution.
We have $F_\ell(\bar w) = \overline{F_\ell(w)}$,
$\mathrm{Head}_\ell(\bar w) = \overline{\mathrm{Tail}_\ell(w)}$, and
$\mathrm{Body}_\ell(\bar w) = \overline{\mathrm{Body}_\ell(w)}$.

The $\ell$-Transformation. Recall that
$E_0 = (\Gamma, \Omega_0, h, \rho_0, x_1 \cdots x_g, x_{g+1} \cdots x_d)$
is our equation with constraints. We start with the $\ell$-factorization of
$w_0 = \sigma(x_1 \cdots x_g) = \sigma(x_{g+1} \cdots x_d)$. Let

$F_\ell(w_0) = (u_1, w_1, v_1) \cdots (u_k, w_k, v_k)$.

A sequence $S = (u_p, w_p, v_p) \cdots (u_q, w_q, v_q)$ with $1 \le p \le q \le k$ is
called an $\ell$-factor. We say that $S$ is a cover of a positive interval
$[\alpha, \beta]$, if both $|w_1 \cdots w_{p-1}| \le \alpha$ and
$|w_{q+1} \cdots w_k| \le m_0 - \beta$. Thus, $w_0[\alpha, \beta]$ becomes a factor of
$w_p \cdots w_q$. It is called a minimal cover if neither
$(u_{p+1}, w_{p+1}, v_{p+1}) \cdots (u_q, w_q, v_q)$
nor $(u_p, w_p, v_p) \cdots (u_{q-1}, w_{q-1}, v_{q-1})$ is a cover of $[\alpha, \beta]$.
The minimal cover exists and it is unique.

We let $\Omega_\ell = \{X \in \Omega_0 \mid \mathrm{body}_\ell(\sigma(X)) \neq 1\}$, and
we are going to define a new left-hand side $L_\ell \in (B_\ell \cup \Omega_\ell)^*$ and a
new right-hand side $R_\ell \in (B_\ell \cup \Omega_\ell)^*$.
For $L_\ell$ we consider those $1 \le i \le g$ where
$\mathrm{body}_\ell(\sigma(x_i)) \neq 1$. Note that this implies $x_i \in \Omega_\ell$
since $\ell \ge 1$ and then the body of a constant is always empty.
Recall the definition of $l(i)$ and $r(i)$, and define
$\alpha = l(i) + |\mathrm{head}_\ell(\sigma(x_i))|$ and
$\beta = r(i) - |\mathrm{tail}_\ell(\sigma(x_i))|$. Then we have
$w_0[\alpha, \beta] = \mathrm{body}_\ell(\sigma(x_i))$. Next consider
the $\ell$-factor $S_i = (u_p, w_p, v_p) \cdots (u_q, w_q, v_q)$ which is the minimal
cover of $[\alpha, \beta]$. Then we have $1 < p \le q < k$ and
$w_p \cdots w_q = w_0[\alpha, \beta] = \mathrm{body}_\ell(\sigma(x_i))$. The
definition of $S_i$ depends only on $x_i$, but not on the choice of the index $i$.

We replace the $\ell$-factor $S_i$ in $F_\ell(w_0)$ by the variable $x_i$. Having done
this for all $1 \le i \le g$ with $\mathrm{body}_\ell(\sigma(x_i)) \neq 1$ we obtain the
left-hand side $L_\ell \in (B_\ell \cup \Omega_\ell)^*$
of the $\ell$-transformation $E_\ell$. For $R_\ell$ we proceed analogously by replacing
those $\ell$-factors $S_i$ where $\mathrm{body}_\ell(\sigma(x_i)) \neq 1$ and
$g + 1 \le i \le d$.

For $E_\ell$ we cannot use the alphabet $B_\ell$, because it might be too large or
even infinite. Therefore we let $\Gamma_\ell'$ be the smallest subset of $B_\ell$ which is
closed under involution and which satisfies
$L_\ell R_\ell \in (\Gamma_\ell' \cup \Omega_\ell)^*$. We let
$\Gamma_\ell = \Gamma_\ell' \cup \Gamma$.
The projection $\pi_\ell : \Gamma_\ell^* \to \Gamma^*$ and the mapping
$h_\ell : \Gamma_\ell \to M$ are defined by the restriction of
$\pi_\ell : B_\ell \to \Gamma^*$, $\pi_\ell(u, w, v) = w$ and
$h_\ell(u, w, v) = h(w) \in M$ and by
$\pi_\ell(a) = a$ and $h_\ell(a) = h(a)$ for $a \in \Gamma$.

Finally, we define the mapping $\rho_\ell : \Omega_\ell \to M$ by
$\rho_\ell(X) = h(\mathrm{body}_\ell(\sigma(X)))$.
This yields the definition of the $\ell$-transformation:
$E_\ell = (\Gamma_\ell, \Omega_\ell, h_\ell, \rho_\ell, L_\ell, R_\ell)$.

The $\ell$-Transformation $E_\ell$ Is Admissible. The proof of the following
proposition uses standard techniques like those in [18] and [9] and it is therefore
omitted.

Proposition 2. There is a polynomial of degree four such that each $E_\ell$ is
admissible for all $\ell \ge 1$.

At this stage we know that all $\ell$-transformations are admissible. Thus, the
equations $E_1, \ldots, E_{m_0}$ are nodes of the search graph. What is left to prove is
that the search graph contains arcs $E_0 \to E_1$ and $E_\ell \to E_{\ell'}$ for
$1 \le \ell < \ell' \le 2\ell$. This again involves the concepts of base change,
projection, and shift. But the presence of constraints does not interfere very much
anymore. Thus, the technical details
are similar to those of Plandowski's paper [18] as generalized in [9].
Abstract. We investigate a refined recursive coloring approach to construct
balanced colorings for hypergraphs. A coloring is called balanced if
each hyperedge has (roughly) the same number of vertices in each color.
We provide a recursive randomized algorithm that colors an arbitrary
hypergraph ($n$ vertices, $m$ edges) with $c$ colors with discrepancy at most
$O(\sqrt{\frac{n}{c} \log m})$. The algorithm has expected running time $O(nm \log c)$.
This result improves the bound of $O(\sqrt{n \log(cm)})$ achieved with
probability at least $\frac{1}{2}$ by a random coloring that independently chooses a
random color for each vertex (fair dice coloring).

Our approach also lowers the current best upper bound for the $c$-color
discrepancy in the case $n = m$ to $O(\sqrt{\frac{n}{c} \log c})$ and extends the
algorithm of Matoušek, Welzl and Wernisch for hypergraphs having bounded dual
shatter function to arbitrary numbers of colors.



1 Introduction and Results 

One problem in the field of combinatorial optimization well-known for its hard- 
ness is the problem of balanced hypergraph colorings, also called combinatorial 
discrepancy problem. Our goal is to color the vertices of a given hypergraph in 
such a way that all hyperedges (simultaneously) are colored in a balanced man- 
ner. Balanced in this context shall mean that each edge has roughly the same 
number of vertices in each color. Equivalently, we may ask for a partition of the 
vertex set which induces a fair partition on all hyperedges. 
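
To make the notion of a balanced coloring concrete, the following sketch measures the imbalance of a given coloring. It assumes the usual convention that the discrepancy of a coloring with $c$ colors is the largest deviation, over all hyperedges and colors, of the number of vertices of that color in the hyperedge from the fair share $|E|/c$; this definition is not spelled out in the text above, and the function name is hypothetical.

```python
def discrepancy(edges, coloring, c):
    """Largest deviation | #(vertices of color i in E) - |E|/c | over all E and i.

    edges: iterable of collections of vertex indices
    coloring: list mapping each vertex index to a color in range(c)
    """
    worst = 0.0
    for edge in edges:
        counts = [0] * c
        for v in edge:
            counts[coloring[v]] += 1
        fair_share = len(edge) / c
        worst = max(worst, max(abs(k - fair_share) for k in counts))
    return worst
```

A coloring with discrepancy 0 splits every hyperedge evenly among the colors, which is the "fair partition on all hyperedges" described above.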

So far, the discrepancy problem has mainly been investigated for two colors.
It has found several applications. Most notable is the connection to uniformly
distributed sets and sequences, which play a crucial role in numerical integration
in higher dimensions (quasi-Monte Carlo methods). This area is also called
geometric discrepancy theory. An excellent reference on geometric discrepancies,
their connection to combinatorial ones and applications is the book of Matoušek
[Mat99]. The notion of linear discrepancy of matrices describes how well a
solution of a linear program can be rounded to an integer solution (lattice
approximation problem). Due to work of Beck and Spencer [BS84] and Lovász et al.
[LSV86], the linear discrepancy can be bounded (in a constructive manner) by
combinatorial discrepancies. Further applications are found in computational
geometry. For this and other applications of discrepancies in theoretical computer
science we refer to the new book of Chazelle [Cha00].

Recent work in communication complexity theory [BHK98] motivates the 
study of balanced colorings in arbitrary numbers of colors. This was begun in 
[DS99]. It turned out that information about the 2-coloring problem in general
does not yield any information on the coloring problem in c colors, $c \in \mathbb{N}_{>2}$.
On the other hand, a recursive method was given that yields for any number 
of colors a coloring with imbalance not larger than roughly twice the maximum 
2-color discrepancy among all subhypergraphs. 

In this paper we extend this approach to make use of the additional assumption
that subhypergraphs on fewer vertices have smaller discrepancy. This is a
natural assumption justified by many examples. Roughly speaking, we show that
if the 2-color discrepancy of the subhypergraphs on $n_0$ vertices is bounded by
$O(n_0^\alpha)$ for some constant $\alpha \in\, ]0,1[$, then the c-color
discrepancy is bounded by $O((\frac{n}{c})^\alpha)$. It seems surprising that
this bound is achievable by a recursive approach, as the first step in the
recursion will find a 2-coloring for the whole hypergraph with discrepancy
guarantee $O(n^\alpha)$ only. We still get the $O((\frac{n}{c})^\alpha)$
discrepancy for the final coloring due to the fact that imbalances inflicted in
earlier rounds of the recursion are split up in a balanced manner in later steps.
It turns out that this effect even exceeds the effect of decreasing discrepancies
of smaller subhypergraphs. Crucial therefore is the last step of the recursion,
where colorings for hypergraphs on roughly $\frac{n}{c}$ vertices are looked for.

There are some further difficulties, like how to handle numbers of colors that 
are not a power of 2, and how to guarantee that the color classes become signif- 
icantly smaller, but we manage to do this in a way that the result is applicable 
to several problems. 

For the general case of an arbitrary hypergraph having n vertices and m edges
the 2-color case is well understood. A fair dice coloring, that is, one that
colors each vertex independently with a random color, has discrepancy
$O(\sqrt{n \log m})$. For m significantly larger than n, this is known to be
tight apart from constant factors. Extending this approach to c colors, we found
in [DS99] that a fair dice c-coloring has discrepancy at most
$\sqrt{\frac{1}{2} n \ln(4cm)}$ with probability at least $\frac{1}{2}$. In this
paper we show that better random colorings can be constructed by combining the
2-color fair dice colorings with a recursive approach. This allows us to compute
a c-coloring with discrepancy $O(\sqrt{\frac{n}{c}\log m})$ in expected time
$O(nm \log c)$.

Our recursive approach can be applied to several further multi-color
discrepancy problems, of which we mention two. It shows that for $n = O(m)$
there is a c-coloring with discrepancy $O(\sqrt{\frac{n}{c}\log c})$ (instead of
$O(\sqrt{n})$ as shown in [DS99]). This extends a famous result of Spencer
[Spe85]. We also extend an algorithm due to Matoušek, Welzl and Wernisch
[MWW84] for hypergraphs having bounded dual shatter function to arbitrary
numbers of colors.

2 Preliminaries

We briefly review the key definitions of traditional 2-color discrepancy theory
and the multi-color ones from [DS99].

Let $\mathcal{H} = (X, \mathcal{E})$ denote a finite hypergraph, i.e., $X$ is a
finite set (of vertices) and $\mathcal{E}$ is a family of subsets of $X$ (called
hyperedges). A partition into two classes can be represented by a coloring
$\chi : X \to \{-1, +1\}$. We call $-1$ and $+1$ colors.

The color classes $\chi^{-1}(-1)$ and $\chi^{-1}(+1)$ form the partition. The
imbalance of a hyperedge $E \in \mathcal{E}$ is expressed by
$\chi(E) := \sum_{x \in E} \chi(x)$. The discrepancy of $\mathcal{H}$ with
respect to $\chi$ is defined by
$\mathrm{disc}(\mathcal{H}, \chi) = \max_{E \in \mathcal{E}} |\chi(E)|$.

For $X_0 \subseteq X$ we call
$\mathcal{H}|_{X_0} := (X_0, \{E \cap X_0 \mid E \in \mathcal{E}\})$ an induced
subhypergraph of $\mathcal{H}$. As the discrepancy of an induced subhypergraph
cannot be bounded in terms of the discrepancy of $\mathcal{H}$ itself, it makes
sense to define the hereditary discrepancy $\mathrm{herdisc}(\mathcal{H})$ to be
the maximum discrepancy of all induced subhypergraphs.

This is all we need from the classical theory, so let us turn to c-color
discrepancies. For technical reasons we need a slight extension of the c-color
discrepancy notion, which refers to the problem of coloring a hypergraph in a
balanced way with respect to a given ratio. A vector $p \in [0,1]^c$ such that
$\|p\|_1 = \sum_{i \in [c]} p_i = 1$ shall be called a weight for c colors. A
c-coloring of $\mathcal{H}$ is simply a mapping $\chi : X \to M$, where $M$ is
any set of cardinality $c$. For convenience, normally one has
$M = [c] := \{1, \ldots, c\}$. Sometimes a different set $M$ will be of
advantage. Note that in applications to communication complexity $M$ can be a
finite Abelian group [BHK98]. The basic idea of measuring the deviation from the
aim motivates the definition of the discrepancy of an edge $E \in \mathcal{E}$
in color $i \in M$ with respect to $\chi$ and $p$ by

$$\mathrm{disc}_{\chi,i,p}(E) := \bigl|\, |\chi^{-1}(i) \cap E| - p_i |E| \,\bigr| .$$

We call

$$\mathrm{disc}(\mathcal{H}, \chi, i, p) := \max_{E \in \mathcal{E}} \mathrm{disc}_{\chi,i,p}(E)$$

the discrepancy of $\mathcal{H}$ with respect to $\chi$ and $p$ in color $i$.
The discrepancy of $\mathcal{H}$ with respect to $\chi$ and $p$ then is

$$\mathrm{disc}(\mathcal{H}, \chi, p) := \max_{i \in M,\, E \in \mathcal{E}} \mathrm{disc}_{\chi,i,p}(E),$$

and finally the discrepancy of $\mathcal{H}$ with respect to the weight $p$ is

$$\mathrm{disc}(\mathcal{H}, c, p) := \min_{\chi : X \to [c]} \mathrm{disc}(\mathcal{H}, \chi, p).$$

We return to our original problem of balanced coloring if we take
$p = \frac{1}{c}\mathbf{1}_c$ (the c-dimensional vector with entries
$\frac{1}{c}$ only) as weight. In this case we will simply omit the extra $p$ in
the definitions above, i.e.,
$\mathrm{disc}(\mathcal{H}, c) := \mathrm{disc}(\mathcal{H}, c, \frac{1}{c}\mathbf{1}_c)$.
In this notation we have
$\mathrm{disc}(\mathcal{H}, 2) = \frac{1}{2}\mathrm{disc}(\mathcal{H})$. The
reason for this slightly strange relation is that the usual 2-color discrepancy
notion does not compare the number of points of a hyperedge in one color with
half the cardinality of the hyperedge, but twice this value due to the
$-1, +1$ sums.
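
The definitions above can be evaluated directly on small instances. The following is a minimal sketch, not part of the original text: the function names and the toy hypergraph are hypothetical, and exact rational arithmetic is used so that the relation between the weighted 2-color discrepancy and the classical $\pm 1$ discrepancy can be checked without rounding.

```python
from fractions import Fraction


def color_discrepancy(edges, coloring, weight):
    """Max over edges E and colors i of | |chi^{-1}(i) & E| - p_i * |E| |."""
    worst = Fraction(0)
    for edge in edges:
        for color, p_i in weight.items():
            in_color = sum(1 for v in edge if coloring[v] == color)
            worst = max(worst, abs(in_color - Fraction(p_i) * len(edge)))
    return worst


def pm_discrepancy(edges, signs):
    """Classical 2-color discrepancy: max over edges of |sum of +-1 values|."""
    return max(abs(sum(signs[v] for v in edge)) for edge in edges)


# Hypothetical toy hypergraph on the vertices 0..3.
edges = [{0, 1, 2, 3}, {0, 1}, {1, 2, 3}]
chi = {0: 1, 1: 2, 2: 1, 3: 2}
half = {1: Fraction(1, 2), 2: Fraction(1, 2)}

# Translate colors 1, 2 into +1, -1 to compare with the classical notion.
signs = {v: (1 if c == 1 else -1) for v, c in chi.items()}
print(color_discrepancy(edges, chi, half), pm_discrepancy(edges, signs))
```

On this instance the weighted value is exactly half of the classical one, as the relation above predicts for a fixed coloring.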

We further note that in the case of 2 colors the discrepancies in both colors
are equal. A consequence of the relation between linear discrepancy and
hereditary discrepancy discovered by [BS84] and [LSV86] is

Lemma 1. For all hypergraphs $\mathcal{H} = (X, \mathcal{E})$ and all 2-color
weights $p \in [0,1]^2$ we have

$$\mathrm{disc}(\mathcal{H}, 2, p) \le \mathrm{herdisc}(\mathcal{H}).$$

This is constructive in the following sense: For all $h \in \mathbb{R}_{>0}$ a
2-coloring $\chi$ such that
$\mathrm{disc}(\mathcal{H}, \chi, p) \le h + \varepsilon |X|$ holds can be
computed by $O(\log \varepsilon^{-1})$ times computing a coloring having
discrepancy at most $h$ for some induced subhypergraph.

An excellent survey of classical and recent results in combinatorial
discrepancy theory is the article of Beck and Sós [BS95], which also contains a
proof of Lemma 1. For very recent developments we refer to Chapter 4 of
Matoušek's book on geometric discrepancies [Mat99].

3 General Approach 

The basic idea of recursive coloring is simple: Color the vertices of the whole 
hypergraph with two colors in such a way that the discrepancy is small, then 
iterate this on the resulting color classes. There are two points that need further 
attention: 

Firstly, this simple approach only works if the number of colors is a power of
2. This is the reason why we use a discrepancy notion respecting weights. Thus
in the case of 3 colors, for example, we would look for a 2-coloring respecting
the ratio $(\frac{1}{3}, \frac{2}{3})$ and then further split the second color
class in the ratio $(\frac{1}{2}, \frac{1}{2})$. There is no general connection
between ordinary discrepancy and discrepancy respecting a particular weight (for
the same reason that there is no general connection between the discrepancies in
different numbers of colors). If the hereditary discrepancy is not too large,
then Lemma 1 allows us to compute a low-discrepancy coloring with respect to a
given weight. As we even assume that the discrepancy decreases for
subhypergraphs on fewer vertices, we can apply this bound without greater loss.

A second point is that to use this assumption of decreasing discrepancies we
need to make sure that the vertex sets considered actually become smaller.
Unfortunately, in general we do not know the size of the color classes generated
by a low discrepancy coloring. If the whole vertex set is a hyperedge, we know
at least that the sizes of the color classes deviate from the intended value by
at most the discrepancy guarantee. This is not too bad if the discrepancy is
relatively small, but even then keeping track of these deviations during the
recursion is tedious. Better bounds seem achievable by the cleaner approach of
only investigating fair colorings, that is, those which have discrepancy less
than one on the set of all vertices.

To ease notation let us agree on the following. Let $p \in [0,1]^c$ be a c-color
weight and $\mathcal{H} = (X, \mathcal{E})$ a hypergraph. We say that $\chi$ is
a fair p-coloring of $\mathcal{H}$ having discrepancy at most $d_i$ in color
$i \in [c]$ to denote that

- $\chi$ is a c-coloring of $\mathcal{H}$,

- $\chi$ is fair with respect to $p$, that is, for all $i \in [c]$ we have
  $\bigl|\, |\chi^{-1}(i)| - p_i |X| \,\bigr| < 1$,

- the discrepancy of $\mathcal{H}$ with respect to $\chi$ and $p$ in color
  $i \in [c]$ is at most $d_i$.

One remark that eases work with the fractional parts: Let us call a weight
$p \in [0,1]^c$ integral with respect to $\mathcal{H}$ (or
$\mathcal{H}$-integral for short) if all $p_i$, $i \in [c]$, are multiples of
$\frac{1}{|X|}$. From the definition it is clear that a fair coloring $\chi$
with respect to an integral weight $p$ fulfills $|\chi^{-1}(i)| = p_i |X|$ for
all colors $i \in [c]$. On the other hand, suppose that we know that for a given
hypergraph and for all integral weights $p$ there is a fair p-coloring that has
discrepancy at most $k$. Then there are fair colorings having discrepancy at
most $k + 1$ for any weight: For an arbitrary weight $p$ there is an integral
weight $p'$ such that $|p_i - p'_i| < \frac{1}{|X|}$ holds for all $i \in [c]$.
Therefore, a fair coloring with respect to $p'$ is also fair with respect to
$p$, and its discrepancy with respect to $p$ is larger (if at all) than the one
with respect to $p'$ by less than one. For these reasons we may restrict
ourselves to the more convenient case that all weights are integral.

Using the following recoloring argument we can transform arbitrary colorings 
into fair colorings. 

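Lemma 2. Let $\mathcal{H} = (X, \mathcal{E})$ be a hypergraph such that
$X \in \mathcal{E}$. Let $p$ be a 2-color weight. Then any 2-coloring $\chi$ of
$\mathcal{H}$ can be modified in $O(|X|)$ time into a fair p-coloring $\chi'$
such that $\mathrm{disc}(\mathcal{H}, \chi', p) \le 2\,\mathrm{disc}(\mathcal{H}, \chi, p)$.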
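
Since the proof of Lemma 2 is omitted, the following is only a sketch of one natural recoloring, under the assumption that the weight is integral: because the whole vertex set is a hyperedge, the size of color class 1 is off by at most the discrepancy d, so moving that many vertices to the other class changes every edge count by at most d. The function names and the toy instance are hypothetical.

```python
from fractions import Fraction


def two_color_disc(edges, coloring, q):
    """Discrepancy w.r.t. the weight (q, 1 - q); colors are 1 and 2."""
    weight = {1: q, 2: 1 - q}
    return max(
        abs(sum(1 for v in e if coloring[v] == i) - weight[i] * len(e))
        for e in edges
        for i in (1, 2)
    )


def make_fair(vertices, coloring, q):
    """Recolor as few vertices as needed so that exactly q*|X| get color 1.

    Assumes q*|X| is an integer (integral weight).
    """
    target = round(q * len(vertices))
    ones = [v for v in vertices if coloring[v] == 1]
    twos = [v for v in vertices if coloring[v] == 2]
    fair = dict(coloring)
    if len(ones) > target:
        for v in ones[: len(ones) - target]:
            fair[v] = 2
    else:
        for v in twos[: target - len(ones)]:
            fair[v] = 1
    return fair


vertices = list(range(6))
edges = [set(vertices), {0, 1}, {4, 5}]  # the whole vertex set is an edge
chi = {0: 1, 1: 1, 2: 1, 3: 1, 4: 2, 5: 2}
q = Fraction(1, 2)

fair_chi = make_fair(vertices, chi, q)
before = two_color_disc(edges, chi, q)
after = two_color_disc(edges, fair_chi, q)
print(before, after)
```

Both passes over the vertex set are linear, which matches the $O(|X|)$ running time stated in the lemma.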

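We omit the proof. To analyze our recursive algorithm we need the following
constants. Let $\alpha \in\, ]0,1[$. For each $p \in\, ]0,1[$ define
$v_\alpha(p)$ to be

$$\max\Bigl\{ \sum_{i=1}^{k} \prod_{j=1}^{i} q_j^{\alpha} \prod_{j=i+1}^{k} q_j \Bigm| k \in \mathbb{N},\ q_1, \ldots, q_{k-1} \in [0, \tfrac{2}{3}],\ q_k \in [0,1],\ \prod_{j=1}^{k} q_j = p \Bigr\}.$$

Set $c_\alpha := \frac{2}{2^{1-\alpha}-1}\bigl(1 + \sum_{i=0}^{\infty} (\tfrac{2}{3})^{(1-\alpha) i}\bigr)$. Then we have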

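Lemma 3. Let $\alpha \in\, ]0,1[$.

(i) Let $0 < p < q \le \frac{2}{3}$. Then $q^\alpha v_\alpha(\frac{p}{q}) + \frac{p}{q} q^\alpha \le v_\alpha(p)$.

(ii) For all $p \in [0,1]$, $\frac{2}{2^{1-\alpha}-1} v_\alpha(p) \le c_\alpha p^\alpha$.

Proof. We skip the first claim, which is not too difficult.

Let $k \in \mathbb{N}$, $q_1, \ldots, q_{k-1} \in [0, \frac{2}{3}]$,
$q_k \in [0,1]$ such that $\prod_{i=1}^{k} q_i = p$ and
$v_\alpha(p) = \sum_{i=1}^{k} \prod_{j=1}^{i} q_j^\alpha \prod_{j=i+1}^{k} q_j$.
For $i \in [k]$ set $x_i := \prod_{j=1}^{i} q_j^\alpha \prod_{j=i+1}^{k} q_j$.
Then $x_k = p^\alpha$ and $x_{k-1} \le x_k$. For $i \in [k-2]$ we have

$$\frac{x_{k-1-i}}{x_{k-1-i+1}} = \frac{q_{k-i}}{q_{k-i}^{\alpha}} = q_{k-i}^{1-\alpha} \le \bigl(\tfrac{2}{3}\bigr)^{1-\alpha}$$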







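and hence $x_{k-1-i} \le (\frac{2}{3})^{(1-\alpha) i}\, x_{k-1}$. Thus

$$\frac{2}{2^{1-\alpha}-1}\, v_\alpha(p) \le \frac{2}{2^{1-\alpha}-1} \Bigl( 1 + \sum_{i=0}^{\infty} \bigl(\tfrac{2}{3}\bigr)^{(1-\alpha) i} \Bigr) p^\alpha = c_\alpha p^\alpha .$$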



Here is the precise setting we investigate in this section:

Assumption 1. Let $\mathcal{H} = (X, \mathcal{E})$ be a hypergraph. Set
$n := |X|$. Let $p_0, \alpha \in\, ]0,1[$ and $D > 0$. For all
$X_0 \subseteq X$ such that $|X_0| \ge p_0 |X|$ and all $q \in [0,1]$ such that
$(q, 1-q)$ is $\mathcal{H}|_{X_0}$-integral there is a fair $(q, 1-q)$-coloring
$\chi$ of $\mathcal{H}|_{X_0}$ having discrepancy at most $D |X_0|^\alpha$.

In addition to what we already explained there is one further detail involved
in our assumption. As we do recursive partitioning, we never need a discrepancy
result concerning induced subhypergraphs on fewer than $\frac{n}{c}$ vertices
(in the equi-weighted case). This observation will be useful in some
applications, e.g., in the case $|\mathcal{E}| = |X|$.

Concerning the complexity there are two possible measures. We can count 
how many 2-colorings have to be computed, or how often a 2-coloring for a 
vertex has to be found. The latter is useful if the complexity of computing the 2- 
colorings is proportional to the number of vertices of the induced subhypergraph 
as in Section 4. 



Theorem 2. Suppose that Assumption 1 holds. Then for each
$\mathcal{H}$-integral weight $p \in [0,1]^c$ there is a fair p-coloring $\chi$
of $\mathcal{H}$ such that the discrepancy is at most
$\frac{2}{2^{1-\alpha}-1} D v_\alpha(p_i) n^\alpha \le D c_\alpha (p_i n)^\alpha$
in all those colors $i \in [c]$ such that $p_i \ge p_0$.

Such colorings can be obtained by computing at most
$(c-1) \lceil \log_2(\frac{1}{p_0}) \rceil$ colorings as in Assumption 1. At
most $3n \log_{1.5}(\frac{1}{p_0})$ times a color for a vertex has to be
computed.



For the proof we first show a stronger bound for the 2-color discrepancy with
respect to a weight $(q, 1-q)$, if $q$ is small.

Lemma 4. Suppose that Assumption 1 holds. Then for each $\mathcal{H}$-integral
weight $p = (2^{-k}, 1 - 2^{-k})$, $2^{-k} \ge p_0$, a fair p-coloring $\chi$
having discrepancy at most

$$\mathrm{disc}(\mathcal{H}, \chi, p) \le \sum_{i=0}^{k-1} 2^{-k+1+i}\, 2^{-\alpha i} D n^\alpha$$

can be computed from $k$ colorings as in Assumption 1. This requires
$\sum_{i=0}^{k-1} 2^{-i} n \le 2n$ times computing a color for a vertex.

We omit the proof here. From our assumptions on $\mathcal{H}$ it is clear that
the assertion of Lemma 4 also holds for any induced subhypergraph
$\mathcal{H}|_{X_0}$ of $\mathcal{H}$ as long as $2^{-k} |X_0| \ge p_0 |X|$. We
use this fact to extend Lemma 4 to arbitrary weights.

Lemma 5. Suppose that Assumption 1 holds. For each $\mathcal{H}$-integral
weight $(q, 1-q)$ with $q \ge p_0$ there is a fair $(q, 1-q)$-coloring having
discrepancy at most $\frac{2}{2^{1-\alpha}-1} D (qn)^\alpha$. A coloring of this
kind can be computed by $\lceil \log_2(\frac{1}{q}) \rceil$ times computing a
coloring as in Assumption 1. This requires at most $3n$ times computing a color
for a vertex.
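
The factor $\frac{2}{2^{1-\alpha}-1}$ in Lemma 5 is what the geometric sum of Lemma 4 evaluates to. The sketch below is an illustration added here, not taken from the paper, and its function names are hypothetical. It checks numerically that $\sum_{i=0}^{k-1} 2^{-k+1+i}\,2^{-\alpha i} \le \frac{2}{2^{1-\alpha}-1}\,2^{-\alpha k}$ for a range of values of k and $\alpha$, leaving out the common factor $D n^\alpha$.

```python
def lemma4_sum(k, alpha):
    """Coefficient of D * n**alpha in the bound of Lemma 4."""
    return sum(2.0 ** (-k + 1 + i) * 2.0 ** (-alpha * i) for i in range(k))


def closed_form(k, alpha):
    """Coefficient of D * n**alpha in Lemma 5 for the weight q = 2**-k."""
    return 2.0 / (2.0 ** (1 - alpha) - 1) * 2.0 ** (-alpha * k)


# Check the inequality on a small grid of (hypothetical) parameter values.
checks = [
    lemma4_sum(k, alpha) <= closed_form(k, alpha)
    for alpha in (0.25, 0.5, 0.75)
    for k in range(1, 20)
]
print(all(checks))
```

The gap between the two sides is $2^{-k+1}/(2^{1-\alpha}-1)$, so the closed form loses very little for larger k.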

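Proof. Let $k \in \mathbb{N}_0$ be maximal subject to the condition that
$q' := 2^k q \le 1$. Since $(q, 1-q)$ is $\mathcal{H}$-integral, so is
$(q', 1-q')$. According to our assumptions there is a fair
$(q', 1-q')$-coloring $\chi_0 : X \to [2]$ having discrepancy at most
$D n^\alpha$. From $|\chi_0^{-1}(1)| = q'|X|$ we have
$2^{-k} |\chi_0^{-1}(1)| = q|X| \in \mathbb{N}_0$. Hence
$(2^{-k}, 1 - 2^{-k})$ is $(\mathcal{H}|_{\chi_0^{-1}(1)})$-integral. By Lemma 4
we may compute a fair $(2^{-k}, 1 - 2^{-k})$-coloring
$\chi_1 : \chi_0^{-1}(1) \to [2]$ that has discrepancy at most
$\sum_{i=0}^{k-1} 2^{-k+1+i}\, 2^{-\alpha i} D (q'n)^\alpha$.

Define a coloring $\chi : X \to [2]$ by $\chi(x) = 1$ if and only if
$\chi_0(x) = 1$ and $\chi_1(x) = 1$. Then $\chi$ is a fair $(q, 1-q)$-coloring.
For an edge $E \in \mathcal{E}$ we compute its discrepancy in color 1:

$$\begin{aligned}
\bigl|\, |E \cap \chi^{-1}(1)| - q|E| \,\bigr|
&= \bigl|\, |E \cap \chi_0^{-1}(1) \cap \chi_1^{-1}(1)| - q|E| \,\bigr| \\
&\le \bigl|\, |E \cap \chi_0^{-1}(1) \cap \chi_1^{-1}(1)| - 2^{-k} |E \cap \chi_0^{-1}(1)| \,\bigr| + \bigl|\, 2^{-k} |E \cap \chi_0^{-1}(1)| - q|E| \,\bigr| \\
&= \bigl|\, |E \cap \chi_0^{-1}(1) \cap \chi_1^{-1}(1)| - 2^{-k} |E \cap \chi_0^{-1}(1)| \,\bigr| + 2^{-k} \bigl|\, |E \cap \chi_0^{-1}(1)| - q'|E| \,\bigr| \\
&\le \sum_{i=0}^{k-1} 2^{-k+1+i}\, 2^{-\alpha i} D (q'n)^\alpha + 2^{-k} D n^\alpha \\
&\le \frac{2 \cdot 2^{-\alpha k}}{2^{1-\alpha}-1} D (q'n)^\alpha \\
&= \frac{2}{2^{1-\alpha}-1} D (qn)^\alpha .
\end{aligned}$$

Note that if $q' = 1$, then we may compute $\chi$ directly using Lemma 4.
Therefore the computation of $\chi$ requires
$\lceil \log_2(\frac{1}{q}) \rceil$ times computing a coloring assured by
Assumption 1. Computing $\chi_0$ means computing a color for $n$ vertices. By
Lemma 4, $\chi_1$ can be computed by at most $2q'n$ times computing a color for
a vertex. To get $\chi$ we therefore computed at most $3n$ times a color for a
vertex. This proves Lemma 5. □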

Proof (of Theorem 2). To make the recursion work properly we need to fix a set
$C$ of colors at the beginning. A weight then is a vector $p = (p_i)_{i \in C}$
indexed by the colors, or, more formally, a function $p : C \to [0,1]$, such
that $\|p\|_1 = \sum_{i \in C} p_i = 1$. To avoid trivial cases we shall always
assume that no color $i \in C$ has the weight $p_i = 0$.

We analyze the following recursive algorithm:

Input: A hypergraph $\mathcal{H} = (X, \mathcal{E})$ fulfilling Assumption 1, a
set $C$ of at least 2 colors and an $\mathcal{H}$-integral weight function
$p : C \to [0,1]$.

Output: A coloring $\chi : X \to C$ as in Theorem 2.

1. Choose a partition $\{C_1, C_2\}$ of the set of colors $C$ such that
   $\|p|_{C_1}\|_1, \|p|_{C_2}\|_1 \le \frac{2}{3}$ or $C_1$ contains a single
   color with weight at least $\frac{1}{3}$. Set
   $(q_1, q_2) := (\|p|_{C_1}\|_1, \|p|_{C_2}\|_1)$.

2. Following Lemma 5, compute a fair $(q_1, q_2)$-coloring
   $\chi_0 : X \to [2]$ that has discrepancy at most
   $\frac{2}{2^{1-\alpha}-1} D (q_i n)^\alpha$ in color $i = 1, 2$ if
   $q_i \ge p_0$. Set $X_i := \chi_0^{-1}(i)$ for $i = 1, 2$.

3. For $i = 1, 2$ do
   if $|C_i| > 1$,
   then by recursion compute a fair coloring $\chi_i : X_i \to C_i$ with
   respect to the weight $\frac{1}{q_i} p|_{C_i}$ having discrepancy at most
   $\frac{2}{2^{1-\alpha}-1} D v_\alpha(\frac{p_j}{q_i}) (q_i n)^\alpha$ in each
   color $j \in C_i$, $p_j \ge p_0$,
   else if $C_i = \{j\}$ for some $j \in C$, set $\chi_i : X_i \to \{j\}$.

4. Return $\chi : X \to C$ defined by $\chi(x) := \chi_1(x)$, if $x \in X_1$,
   and $\chi(x) := \chi_2(x)$, if $x \in X_2$, for all $x \in X$.
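
The control flow of Steps 1 to 4 can be sketched as follows. This is a simplified illustration and not the authors' implementation: the 2-coloring oracle of Assumption 1 is replaced by a hypothetical stand-in that picks a uniformly random fair split and gives no discrepancy guarantee, the split is done directly rather than through the halving of Lemma 5, and the threshold $\frac{1}{3}$ in Step 1 is an assumption. All function names are made up for this sketch.

```python
import random
from fractions import Fraction


def split_colors(weights):
    """Step 1: both parts have weight <= 2/3, or C1 is one color of weight >= 1/3."""
    heavy = [c for c, w in weights.items() if w >= Fraction(1, 3)]
    if heavy:
        c1 = [heavy[0]]
    else:
        c1, total = [], Fraction(0)
        for c, w in weights.items():
            if total >= Fraction(1, 3):
                break
            c1.append(c)
            total += w
    c2 = [c for c in weights if c not in c1]
    return c1, c2


def random_fair_two_coloring(vertices, q, rng):
    """Stand-in oracle: a random subset of size q*|vertices| becomes class 1."""
    verts = list(vertices)
    rng.shuffle(verts)
    size = round(q * len(verts))
    return verts[:size], verts[size:]


def recursive_coloring(vertices, weights, rng):
    """Steps 2-4: split the vertices, recurse on both classes, merge the results."""
    if len(weights) == 1:
        (color,) = weights
        return {v: color for v in vertices}
    c1, c2 = split_colors(weights)
    q1 = sum(weights[c] for c in c1)
    x1, x2 = random_fair_two_coloring(vertices, q1, rng)
    coloring = {}
    for part, colors, q in ((x1, c1, q1), (x2, c2, 1 - q1)):
        sub_weights = {c: weights[c] / q for c in colors}  # weight (1/q_i) p|C_i
        coloring.update(recursive_coloring(part, sub_weights, rng))
    return coloring


rng = random.Random(1)
weights = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
chi = recursive_coloring(list(range(12)), weights, rng)
print(sorted(chi.values()))
```

Because every split is fair and the weights are integral, each color j receives exactly $p_j |X|$ vertices, whatever oracle is plugged in.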

We prove that our algorithm produces a coloring as claimed in Theorem 2 and
also fulfills the complexity statements. Suppose by induction that this holds
for sets of fewer than $c$ colors. We analyze the algorithm being started on an
input as above with $|C| = c$.

We first show correctness. For Step 1 note that both $C_1$ and $C_2$ are
non-empty and that $q_2 \le \frac{2}{3}$ holds. Therefore by Lemma 5 and
induction the colorings $\chi_i$, $i = 0, 1, 2$, can be computed as desired in
Steps 2 and 3. Let $E \in \mathcal{E}$, $i \in [2]$ and $j \in C_i$ such that
$p_j \ge p_0$. If $|C_i| > 1$, then

$$\begin{aligned}
\bigl|\, |E \cap \chi^{-1}(j)| - p_j|E| \,\bigr|
&= \bigl|\, |E \cap \chi_0^{-1}(i) \cap \chi_i^{-1}(j)| - p_j|E| \,\bigr| \\
&\le \bigl|\, |E \cap \chi_0^{-1}(i) \cap \chi_i^{-1}(j)| - \tfrac{p_j}{q_i} |E \cap \chi_0^{-1}(i)| \,\bigr| + \bigl|\, \tfrac{p_j}{q_i} |E \cap \chi_0^{-1}(i)| - p_j|E| \,\bigr| \\
&\le \bigl|\, |(E \cap X_i) \cap \chi_i^{-1}(j)| - \tfrac{p_j}{q_i} |E \cap X_i| \,\bigr| + \tfrac{p_j}{q_i} \bigl|\, |E \cap \chi_0^{-1}(i)| - q_i|E| \,\bigr| \\
&\le \frac{2}{2^{1-\alpha}-1} D v_\alpha(p_j) n^\alpha
\end{aligned}$$

by Lemma 3 (i). On the other hand, if $C_i$ contains a single color $j$, then
$p_j = q_i$ and

$$\bigl|\, |E \cap \chi^{-1}(j)| - p_j|E| \,\bigr| = \bigl|\, |E \cap \chi_0^{-1}(i)| - q_i|E| \,\bigr| \le \frac{2}{2^{1-\alpha}-1} D v_\alpha(p_j) n^\alpha .$$

This is the correctness statement.

Concerning the complexity note that the computation of $\chi_0$ takes at most
$\lceil \log_2(\frac{1}{p_0}) \rceil$ and the one of the $\chi_i$ takes at most
$(|C_i| - 1) \lceil \log_2(\frac{q_i}{p_0}) \rceil$ colorings as in
Assumption 1. These are not more than
$(c-1) \lceil \log_2(\frac{1}{p_0}) \rceil$ colorings altogether.

By Lemma 5 we compute at most $3n$ times a color for a vertex in Step 2. If
$|C_i| > 1$ for both $i = 1, 2$, then $q_i \le \frac{2}{3}$ and computing
$\chi_i$ involves at most
$3 q_i n \log_{1.5}(\frac{q_i}{p_0}) \le 3 q_i n \log_{1.5}(\frac{2}{3 p_0})$
times computing a color for a vertex. Altogether this makes at most
$3n + 3 q_1 n \log_{1.5}(\frac{2}{3 p_0}) + 3 q_2 n \log_{1.5}(\frac{2}{3 p_0}) \le 3n(1 + \log_{1.5}(\frac{2}{3 p_0})) = 3n \log_{1.5}(\frac{1}{p_0})$
times computing a color for a vertex. If $|C_i| = 1$ then there is nothing to do
to get $\chi_i$, and the respective term just vanishes in the calculation
above. □

4 A Randomized Algorithm for Arbitrary Hypergraphs 

Let $\mathcal{H} = (X, \mathcal{E})$ denote an arbitrary hypergraph. Set
$n := |X|$ and $m := |\mathcal{E}|$ for convenience. In [DS99] it was shown that
a random coloring generated by coloring each vertex independently with each
color with probability $\frac{1}{c}$ has discrepancy at most
$\sqrt{\frac{1}{2} n \ln(4mc)}$ with probability at least $\frac{1}{2}$. This
can be used to design a randomized algorithm computing such a coloring by
repeatedly generating and testing such a random coloring until its discrepancy
is at most $\sqrt{\frac{1}{2} n \ln(4mc)}$.
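
The generate-and-test scheme just described can be sketched as below. This is a minimal illustration under stated assumptions: the threshold $\sqrt{\frac{1}{2} n \ln(4mc)}$ is the bound as read from the garbled source, and the function names, the retry cap, and the random test instance are hypothetical.

```python
import math
import random


def max_discrepancy(edges, chi, c):
    """Largest deviation of a color count in an edge from |E| / c."""
    return max(
        abs(sum(1 for v in e if chi[v] == i) - len(e) / c)
        for e in edges
        for i in range(c)
    )


def fair_dice_coloring(n, edges, c, rng, max_tries=1000):
    """Draw independent uniform colorings until the discrepancy bound holds."""
    bound = math.sqrt(0.5 * n * math.log(4 * len(edges) * c))
    for _ in range(max_tries):
        chi = [rng.randrange(c) for _ in range(n)]
        disc = max_discrepancy(edges, chi, c)
        if disc <= bound:
            return chi, disc, bound
    raise RuntimeError("no coloring within the bound found")


rng = random.Random(0)
n, m, c = 400, 40, 3
edges = [rng.sample(range(n), 200) for _ in range(m)]  # hypothetical instance
chi, disc, bound = fair_dice_coloring(n, edges, c, rng)
print(disc <= bound)
```

Each test costs one pass over all edges, so if a single draw succeeds with probability at least $\frac{1}{2}$ the expected number of passes is at most two.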

In this section we show that via the recursive approach of Theorem 2 a better 
bound can be achieved. This also proves that the discrepancy decreases for larger 
numbers of colors. 

Theorem 3. For each $\mathcal{H}$-integral c-color weight $p$ a c-coloring
$\chi$ having discrepancy at most
$\mathrm{disc}(\mathcal{H}, \chi, p, i) \le 45 \sqrt{p_i n \ln(4m)}$ in color
$i \in [c]$ can be computed in expected time
$O(nm \log(\frac{1}{\min\{p_i \mid i \in [c]\}}))$. In particular, a c-coloring
$\chi$ such that

$$\mathrm{disc}(\mathcal{H}, \chi, c) \le 45 \sqrt{\tfrac{n}{c} \ln(4m)} + 1$$

can be computed in expected time $O(nm \log c)$.

Proof. There is little to do for $m = 1$, so let us assume that $m \ge 2$. We
show that the colorings required by Assumption 1 can be computed in expected
time $O(|X_0| m)$. Denote by $\bar{\mathcal{H}}$ the hypergraph obtained from
$\mathcal{H}$ by adding the whole vertex set as an additional hyperedge. Let
$X_0 \subseteq X$ and $(q, 1-q)$ be a 2-color weight. Let $\chi : X_0 \to [2]$
be a random coloring independently coloring the vertices with probabilities
$P(\chi(x) = 1) = q$ and $P(\chi(x) = 2) = 1 - q$ for all $x \in X_0$. A
standard application of the Chernoff inequality (cf. [AS00]) shows that

(*) $\mathrm{disc}(\bar{\mathcal{H}}|_{X_0}, \chi, (q, 1-q)) \le \sqrt{\tfrac{1}{2} |X_0| \ln(4m)}$

holds with probability at least $\frac{1}{4}$. Hence by repeatedly generating
and testing these random colorings until (*) holds we obtain a randomized
algorithm computing such a coloring with expected running time $O(nm)$. By
Lemma 2 we get a fair $(q, 1-q)$-coloring for $\mathcal{H}|_{X_0}$ having
discrepancy at most $\sqrt{2 |X_0| \ln(4m)}$. Hence for $\alpha = \frac{1}{2}$,
$D = \sqrt{2 \ln(4m)}$ and arbitrary $p_0$ the colorings required in
Assumption 1 can be computed in expected time $O(|X_0| m)$.

Therefore we may apply Theorem 2 with $p_0 = \min\{p_i \mid i \in [c]\}$. The
discrepancy bounds follow from $c_\alpha < 31.4$. Computing such a coloring
involves $O(\log(\frac{1}{p_0}) n)$ times computing a color for a vertex. As
this can be done in expected time $O(m)$, we have the claimed bound of
$O(nm \log(\frac{1}{p_0}))$. □
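
The constants in this proof can be checked numerically. The sketch below is an added illustration; it assumes the closed form $c_\alpha = \frac{2}{2^{1-\alpha}-1}\bigl(1 + \frac{1}{1-(2/3)^{1-\alpha}}\bigr)$, which is a reading of the garbled definition obtained by summing the geometric series in the proof of Lemma 3, and the function name is hypothetical.

```python
import math


def c_alpha(alpha):
    """Assumed closed form of the constant c_alpha (geometric series summed)."""
    ratio = (2.0 / 3.0) ** (1 - alpha)
    return 2.0 / (2.0 ** (1 - alpha) - 1) * (1 + 1 / (1 - ratio))


c_half = c_alpha(0.5)
# D = sqrt(2 ln(4m)) contributes a factor sqrt(2) in front of sqrt(p_i n ln(4m)).
theorem3_constant = math.sqrt(2) * c_half
print(c_half < 31.4, theorem3_constant < 45)
```

Under this reading both inequalities hold, which is consistent with the constant 45 in Theorem 3.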

Some remarks concerning the theorem and its proof above. For the complexity
guarantee we assumed that the complexity contribution of computing the
2-colorings dominates the remaining operations of the recursive algorithm of
Section 3. This is justified by the fact that we may assume $c \le n$ since
integrality ensures $p_i \ge \frac{1}{n}$ for all colors $i \in C$.

A second point is that the constant of 45 could be improved by a more careful
way of generating the random 2-colorings. In particular, by taking a random fair
coloring we could avoid the extra factor of 2 inflicted by Lemma 2. This,
though, requires an analysis of the hypergeometric distribution, which is
considerably more difficult than ours.

Finally let us remark that the construction of the 2-colorings can be de- 
randomized through an algorithmic version of the Chernoff-Hoeffding inequality 
(cf. [SS96]). Thus the colorings in Theorem 3 can be computed by a deterministic 
algorithm as well. 

5 Further Results 

In this section we give two more applications of Theorem 2 that extend 2-color 
bounds or algorithms to c colors. 

5.1 Six Standard Deviations 

The famous “Six Standard Deviations” result due to Spencer [Spe85] states that
there is a constant $K$ such that for all hypergraphs
$\mathcal{H} = (X, \mathcal{E})$ having $n$ vertices and $m \ge n$ edges

$$\mathrm{disc}(\mathcal{H}) \le K \sqrt{n \ln(\tfrac{2m}{n})}$$

holds.

The interesting case is of course the one where $m = O(n)$ and thus
$\mathrm{disc}(\mathcal{H}) = O(\sqrt{n})$. For $m$ significantly larger than
$n$ this result is outperformed by the simple fair coin flip random coloring.
The title “Six Standard Deviations Suffice” of Spencer's paper comes from the
fact that for $n = m$ large enough, $\mathrm{disc}(\mathcal{H}) \le 6\sqrt{n}$
holds. Using the relation between discrepancies respecting a particular weight
and hereditary discrepancy (Lemma 1) and the recoloring argument (Lemma 2), we
derive from Spencer's result (without proof)

Lemma 6. For any $X_0 \subseteq X$ and $\mathcal{H}|_{X_0}$-integral weight
$(q, 1-q)$ there is a fair $(q, 1-q)$-coloring of $\mathcal{H}|_{X_0}$ having
discrepancy at most

$$2K \sqrt{|X_0| \ln(\tfrac{2m}{|X_0|})} .$$

Lemma 6 and Theorem 2 yield

Theorem 4. Let $\mathcal{H} = (X, \mathcal{E})$ denote a hypergraph having $n$
vertices and $m \ge n$ edges and $p \in [0,1]^c$ an integral weight. Set
$p_0 := \min_{i \in [c]} p_i$. Then there is a fair p-coloring having
discrepancy at most $63K \sqrt{p_i n \ln(\frac{2m}{p_0 n})}$ in color $i$. In
consequence, for $|X| = |\mathcal{E}| = n$ we have

$$\mathrm{disc}(\mathcal{H}, c) \le O\Bigl(\sqrt{\tfrac{n}{c} \ln c}\Bigr) .$$

Proof. By Lemma 6 we may apply Theorem 2 with $\alpha = \frac{1}{2}$,
$D = 2K \sqrt{\ln(\frac{2m}{p_0 n})}$ and $p_0$. This yields a fair p-coloring
having discrepancy at most $D c_\alpha \sqrt{p_i n}$ in color $i \in [c]$. The
claim follows from $c_\alpha < 31.4$. □

This is quite close to the optimum. An extension of Spencer's [Spe87] proof
shows that hypergraphs arising from Hadamard matrices have c-color discrepancy
$\Omega(\sqrt{\frac{n}{c}})$. It is a famous open problem already for 2 colors
whether colorings having discrepancy $O(\sqrt{n \log(\frac{2m}{n})})$ can be
computed efficiently. Therefore a constructive version of Theorem 4 is not to
be expected at the moment.

5.2 Dual Shatter Function Bound 

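If we have some more structural information about the hypergraph, in many
cases there are constructive solutions to the discrepancy problem. Matoušek,
Welzl and Wernisch showed that if the dual shatter function
$\pi^*_{\mathcal{H}}$ of $\mathcal{H}$ is bounded by
$\pi^*_{\mathcal{H}}(r) = O(r^d)$ for some constant $d \ge 2$, then a 2-coloring
$\chi$ such that

$$\mathrm{disc}(\mathcal{H}, \chi) = O\bigl(n^{\frac{1}{2} - \frac{1}{2d}} \sqrt{\log n}\bigr)$$

can be computed by a randomized polynomial time algorithm. The dual shatter
function is monotone with respect to induced subhypergraphs, that is, for all
$X_0 \subseteq X$ we have $\pi^*_{\mathcal{H}|_{X_0}} \le \pi^*_{\mathcal{H}}$.
Hence we conclude from Lemma 1 that Assumption 1 is fulfilled with
$\alpha = \frac{1}{2} - \frac{1}{2d}$. We derive

Theorem 5. If the dual shatter function of $\mathcal{H}$ is bounded by
$\pi^*_{\mathcal{H}}(r) = O(r^d)$ for some constant $d \ge 2$, then a c-coloring
$\chi$ such that

$$\mathrm{disc}(\mathcal{H}, \chi, c) = O\bigl((\tfrac{n}{c})^{\frac{1}{2} - \frac{1}{2d}} \sqrt{\log n}\bigr)$$

can be computed by a randomized polynomial time algorithm.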

6 Conclusion and Discussion 

In this paper we presented a recursive method to construct c-colorings from
2-colorings with respect to a given weight. Our approach uses the fact that
induced subhypergraphs on fewer vertices often have smaller discrepancies. We
extend several 2-color results to arbitrary numbers of colors. In particular, we
show that a clever extension of the 2-color approach of independently choosing a
color for each vertex is not doing the same with c colors, but combining the
2-color result with a recursive algorithm. This seems strange at first, but the
gain of a $\sqrt{c}$ factor is convincing.

We should remark that this gain is ‘real’, that is, it evolves from the difference 
in the random experiments rather than a weak analysis of the fair dice colorings. 

We believe that there are three reasons explaining the behavior. First, it is 
a general result of [DS99] that 2-coloring has the significant advantage that the 
discrepancy in both colors is the same. Therefore one actually has to take care 
of just one color. Second, instead of having just one random experiment which 
has to yield a ‘good’ coloring with sufficiently large probability, here we have 
a series of random experiments that are executed one after another. Third, all 
colorings generated by our algorithm are fair.

Abstract. We study effectively given positive reals (more specifically,
computably enumerable reals) under a measure of relative randomness
introduced by Solovay [32] and studied by Calude, Hertling, Khoussainov,
and Wang [7], Calude [3], Slaman [28], and Coles, Downey, and
LaForte [14], among others. This measure is called domination or Solovay
reducibility, and is defined by saying that $\alpha$ dominates $\beta$ if there
are a constant $c$ and a partial computable function $\varphi$ such that for all
positive rationals $q < \alpha$ we have $\varphi(q)\downarrow < \beta$ and
$\beta - \varphi(q) \le c(\alpha - q)$.

The intuition is that an approximating sequence for $\alpha$ generates one for
$\beta$ whose rate of convergence is not much slower than that of the original
sequence. It is not hard to show that if $\alpha$ dominates $\beta$ then the
initial segment complexity of $\alpha$ is at least that of $\beta$.

In this paper we are concerned with structural properties of the degree 
structure generated by Solovay reducibility. We answer a long-standing 
question in this area of investigation by establishing the density of the 
Solovay degrees. We also provide a new characterization of the random 
c.e. reals in terms of splittings in the Solovay degrees. Specifically, we 
show that the Solovay degrees of computably enumerable reals are dense, 
that any incomplete Solovay degree splits over any lesser degree, and that 
the join of any two incomplete Solovay degrees is incomplete, so that the 
complete Solovay degree does not split at all. The methodology is of 
some technical interest, since it includes a priority argument in which 
the injuries are themselves controlled by randomness considerations. 



1 Introduction 

In this paper we are concerned with effectively generated reals in the interval
(0, 1] and their relative randomness. In what follows, real and rational will
mean positive real and positive rational, respectively. It will be convenient to
work modulo 1, that is, identifying $n + \alpha$ and $\alpha$ for any
$n \in \omega$ and $\alpha \in (0, 1]$, and we do this below without further
comment.

Our basic objects are reals that are limits of computable increasing sequences
of rationals. We call such reals computably enumerable (c.e.), though they have
also been called recursively enumerable, left computable (by Ambos-Spies,
Weihrauch, and Zheng [1]), and left semicomputable.¹ If, in addition to the
existence of a computable increasing sequence $q_0, q_1, \ldots$ of rationals
with limit $\alpha$, there is a total computable function $f$ such that
$\alpha - q_{f(n)} < 2^{-n}$ for all $n \in \omega$, then $\alpha$ is called
computable. These and related concepts have been widely studied. In addition to
the papers and books mentioned elsewhere in this introduction, we may cite,
among others, early work of Rice [26], Lachlan [21], Soare [29], and Ceitin [9],
and more recent papers by Ko [18,19], Calude, Coles, Hertling, and Khoussainov
[6], Ho [17], Boldi and Vigna [2], and Downey and LaForte [16].

* Downey and Hirschfeldt’s research supported by the Marsden Fund for Basic Science.
Nies and Downey’s research supported by a US/NZ cooperative science grant. Nies’s
research supported by NSF grant DMS-9803482.

A computer $M$ is self-delimiting if, for each binary string $\sigma$,
$M(\sigma)\downarrow$ implies that $M(\sigma')\uparrow$ for all $\sigma'$
properly extending $\sigma$. It is universal if for each self-delimiting
computer $N$ there is a constant $c$ such that, for each binary string
$\sigma$, if $N(\sigma)\downarrow$ then $M(\tau)\downarrow = N(\sigma)$ for some
$\tau$ with $|\tau| \le |\sigma| + c$.

Fix a self-delimiting universal computer $M$. We can define Chaitin's number
$\Omega = \Omega_M$ via

$$\Omega = \sum_{M(\sigma)\downarrow} 2^{-|\sigma|} .$$

The properties of $\Omega$ relevant to this paper are independent of the choice
of $M$. A c.e. real is an $\Omega$-number if it is $\Omega_M$ for some
self-delimiting universal computer $M$.
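
The sum defining $\Omega$ converges because the halting set of a self-delimiting computer is prefix-free. The following toy sketch is an added illustration and not a computation of $\Omega$ itself, which is not computable: for a small, hypothetical finite "halting set" it checks prefix-freeness and evaluates $\sum 2^{-|\sigma|}$ exactly. The function names are made up for this sketch.

```python
from fractions import Fraction


def is_prefix_free(strings):
    """True if no string in the collection is a prefix of another one."""
    ordered = sorted(strings)
    # In lexicographic order a prefix is always followed directly by an extension.
    return all(not b.startswith(a) for a, b in zip(ordered, ordered[1:]))


def halting_weight(domain):
    """Sum of 2^{-|sigma|} over the given finite set of binary strings."""
    return sum(Fraction(1, 2 ** len(sigma)) for sigma in domain)


domain = ["0", "10", "110"]  # hypothetical halting set of a toy machine
print(is_prefix_free(domain), halting_weight(domain))
print(is_prefix_free(["0", "01"]))
```

For any prefix-free set the weight is at most 1 by Kraft's inequality, which is why $\Omega$ can be regarded as a real in (0, 1].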

The c.e. real $\Omega$ is random in the canonical Martin-Löf sense. Recall that
a Martin-Löf test is a uniformly c.e. sequence $\{V_e : e > 0\}$ of c.e. subsets
of $\{0,1\}^*$ such that for all $e > 0$,

$$\mu(V_e \{0,1\}^\omega) \le 2^{-e} ,$$

where $\mu$ denotes the usual product measure on $\{0,1\}^\omega$. The string
$\sigma \in \{0,1\}^\omega$ and the real $0.\sigma$ are random, or more
precisely, 1-random, if $\sigma \notin \bigcap_{e>0} V_e \{0,1\}^\omega$ for
every Martin-Löf test $\{V_e : e > 0\}$.

An alternate characterization of the random reals can be given via the notion
of a Solovay test. We give a somewhat nonstandard definition of this notion,
which will be useful below. A Solovay test is a c.e. multiset
$\{I_i : i \in \omega\}$ of intervals with rational endpoints such that
$\sum_{i \in \omega} |I_i| < \infty$, where $|I|$ is the length of the interval
$I$. As Solovay [32] showed, a real $\alpha$ is random if and only if
$\{i \in \omega : \alpha \in I_i\}$ is finite for every Solovay test
$\{I_i : i \in \omega\}$.

Many authors have studied $\Omega$ and its properties, notably Chaitin
[11,12,13] and Martin-Löf [25]. In the very long and widely circulated
manuscript [32] (a fragment of which appeared in [33]), Solovay carefully
investigated relationships between Martin-Löf-Chaitin prefix-free complexity,
Kolmogorov complexity, and properties of random languages and reals. See Chaitin
[11] for an account of some of the results in this manuscript.

¹ We recognize that the term computably enumerable real is not ideal, but it is
the one used by Solovay, Chaitin, Soare, and others in this tradition (modulo
the recent terminological move from recursive to computable), and the
alternatives are also problematic; for instance, semicomputable has an unrelated
meaning in computability theory.

Solovay discovered that several important properties of $\Omega$ (whose
definition is model-dependent) are shared by another class of reals he called
$\Omega$-like, whose definition is model-independent. To define this class, he
introduced the following reducibility relation among c.e. reals, called
domination or Solovay reducibility.

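Definition 1.1. Let $\alpha$ and $\beta$ be c.e. reals. We say that $\alpha$
dominates $\beta$, and write $\beta \le_S \alpha$, if there are a constant $c$
and a partial computable function $\varphi : \mathbb{Q} \to \mathbb{Q}$ such
that for each rational $q < \alpha$ we have $\varphi(q)\downarrow < \beta$ and

$$\beta - \varphi(q) \le c(\alpha - q) .$$

We write $\beta <_S \alpha$ if $\beta \le_S \alpha$ and
$\alpha \not\le_S \beta$, and we write $\alpha \equiv_S \beta$ if
$\alpha \le_S \beta$ and $\beta \le_S \alpha$.

The notation $\le_{dom}$ has sometimes been used instead of $\le_S$.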
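
A toy instance may help to read Definition 1.1. The sketch below is an added illustration with hypothetical names: it takes a rational $\alpha$ (so both reals are trivially computable) and $\beta = \alpha/2$, uses $\varphi(q) = q/2$, and tests the two conditions of the definition on a finite sample of rationals $q < \alpha$. A finite sample can only refute a candidate witness, not prove domination.

```python
from fractions import Fraction


def witnesses_domination(alpha, beta, phi, c, sample):
    """Check phi(q) < beta and beta - phi(q) <= c*(alpha - q) for sampled q < alpha."""
    return all(
        phi(q) < beta and beta - phi(q) <= c * (alpha - q)
        for q in sample
        if q < alpha
    )


def halve(q):
    """Candidate reduction phi(q) = q / 2."""
    return q / 2


alpha = Fraction(2, 3)
beta = alpha / 2
sample = [Fraction(k, 100) for k in range(1, 100)]

print(witnesses_domination(alpha, beta, halve, Fraction(1, 2), sample))
print(witnesses_domination(alpha, beta, halve, Fraction(1, 4), sample))
```

Here $\beta - \varphi(q) = \frac{1}{2}(\alpha - q)$, so the constant $c = \frac{1}{2}$ works with equality while $c = \frac{1}{4}$ fails for this particular $\varphi$.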

Recall that the prefix-free complexity $H(\tau)$ of a binary string $\tau$ is
the length of the shortest binary string $\sigma$ such that
$M(\sigma)\downarrow = \tau$, where $M$ is a fixed self-delimiting universal
computer. (The choice of $M$ does not affect the prefix-free complexity, up to a
constant additive factor.) Most of the statements about $H(\tau)$ made below
also hold for the standard Kolmogorov complexity $K(\tau)$. For more on the
definitions and basic properties of $H(\tau)$ and $K(\tau)$, see Chaitin [13],
Calude [4], and Li and Vitanyi [24]. Among the many works dealing with these
and related topics, and in addition to those mentioned elsewhere in this paper,
we may cite Solomonoff [30,31], Kolmogorov [20], Levin [22,23], Schnorr [27],
Chaitin [10], and the expository article Calude and Chaitin [5].

Solovay reducibility is naturally associated with randomness because of the 
following fact. (We identify a real $\alpha \in (0,1]$ with the infinite binary string $\sigma$ such
that $\alpha = 0.\sigma$. The fact that certain reals have two different dyadic expansions
need not concern us here, since all such reals are rational.) 

Theorem 1.2 (Solovay [32]). Let $\beta \le_S \alpha$ be c.e. reals. There is a constant
$O(1)$ such that $H(\beta \restriction n) \le H(\alpha \restriction n) + O(1)$ for all $n \in \omega$.

Solovay observed that $\Omega$ dominates all c.e. reals, and Theorem 1.2 implies
that if a c.e. real dominates all c.e. reals then it must be random. This led Solovay
to define a c.e. real to be $\Omega$-like if it dominates all c.e. reals. The point is that
the definition of $\Omega$-like seems quite model-independent (in the sense that it does
not require a choice of self-delimiting universal computer), as opposed to the
model-dependent definition of $\Omega$. However, Calude, Hertling, Khoussainov, and
Wang [7] showed that the two notions coincide.

Theorem 1.3 (Calude, Hertling, Khoussainov, and Wang). A c.e. real
is $\Omega$-like if and only if it is an $\Omega$-number.

This circle of ideas was completed recently by Slaman [28], who proved the
converse to the fact that $\Omega$-like reals are random.

Theorem 1.4 (Slaman). A c.e. real is random if and only if it is $\Omega$-like.

It is natural to seek to understand the c.e. reals under Solovay reducibility.
A useful characterization of this reducibility is given by the following lemma. 

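Lemma 1.5. Let $\alpha$ and $\beta$ be c.e. reals. Then $\beta \le_S \alpha$ if and only if for every
computable sequence of rationals $a_0, a_1, \ldots$ such that

\[ \alpha = \sum_{n \in \omega} a_n \]

there are a constant $c$ and a computable sequence of rationals
$\varepsilon_0, \varepsilon_1, \ldots < c$ such that

\[ \beta = \sum_{n \in \omega} \varepsilon_n a_n . \]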

Phrased another way, Lemma 1.5 says that the c.e. reals dominated by a given
c.e. real $\alpha$ essentially correspond to splittings of $\alpha$ under arithmetic addition.

Corollary 1.6. Let $\alpha$ and $\beta$ be c.e. reals. Then $\beta \le_S \alpha$ if and only if there is a
c.e. real $\gamma$ and a rational $c$ such that $c\alpha = \beta + \gamma$.
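
The "if" direction of Corollary 1.6 can be checked directly against Definition 1.1. The following is only a sketch under stated assumptions: the strictly increasing computable rational sequences $\beta_s \to \beta$ and $\gamma_s \to \gamma$, the derived sequence $\alpha_s$, and the particular map $\varphi$ are notation introduced here for illustration and are not taken from the text.

```latex
\begin{align*}
&\text{Suppose } c\alpha = \beta + \gamma \text{ with } c > 0 \text{ rational, and put }
  \alpha_s := (\beta_s + \gamma_s)/c, \text{ so that } \alpha_s \to \alpha .\\
&\text{Given a rational } q < \alpha, \text{ search for the least } s \text{ with } \alpha_s > q
  \text{ and set } \varphi(q) := \beta_s < \beta .\\
&\text{Then } \beta - \varphi(q) = c(\alpha - \alpha_s) - (\gamma - \gamma_s)
  \le c(\alpha - \alpha_s) < c(\alpha - q) .
\end{align*}
```

The search halts for every rational $q < \alpha$ because $\alpha_s$ converges to $\alpha$, so $\varphi$ is partial computable and witnesses $\beta \le_S \alpha$ with the constant $c$.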

Solovay reducibility has a number of other beautiful interactions with arith- 
metic, as we now discuss. 

The relation $\le_S$ is reflexive and transitive, and hence $\equiv_S$ is an equivalence
relation on the c.e. reals. Thus we can define the Solovay degree $[\alpha]$ of a c.e.
real $\alpha$ as its $\equiv_S$ equivalence class. (When we mention Solovay degrees below,
we always mean Solovay degrees of c.e. reals.) The Solovay degrees form an
uppersemilattice (a partial ordering in which every pair of elements has a least
upper bound, called the join of these elements), with the join of $[\alpha]$ and $[\beta]$ being
$[\alpha + \beta] = [\alpha\beta]$, a fact observed by Solovay and others ($\oplus$ is definitely not a join
operation here). We note the following slight improvement of this result. Recall
that an uppersemilattice $U$ is distributive if for all $a_0, a_1, b \in U$ with $b \le a_0 \vee a_1$
there exist $b_0, b_1 \in U$ such that $b_0 \vee b_1 = b$ and $b_i \le a_i$ for $i = 0, 1$.

Lemma 1.7. The Solovay degrees of c.e. reals form a distributive uppersemilattice
with $[\alpha] \vee [\beta] = [\alpha + \beta] = [\alpha\beta]$.

There is a least Solovay degree, the degree of the computable reals, as well as
a greatest one, the degree of $\Omega$. For proofs of these facts and more on c.e. reals
and Solovay reducibility, see for instance Chaitin [11,12,13], Calude, Hertling, 
Khoussainov, and Wang [7], Calude and Nies [8], Calude [3], Slaman [28], and 
Coles, Downey, and LaForte [14]. 

Despite the many attractive features of the Solovay degrees, their structure is 
largely unknown. Coles, Downey, and LaForte [14] have shown that this structure 
is very complicated by proving that it has an undecidable first order theory. 

One question addressed in the present paper, open since Solovay’s original 
1974 notes, is whether the structure of the Solovay degrees is dense. Indeed, up 
to now, it was not known even whether there is a minimal Solovay degree. That
is, intuitively, if a c.e. real $\alpha$ is not computable, must there be a c.e. real that is
also not computable, yet is strictly less random than $\alpha$?

In this extended abstract, we sketch part of a proof that the Solovay degrees
of c.e. reals are dense. In the full version of this paper, we give the complete proof
of this result. This proof is divided into two parts: we show that if $\alpha <_S \Omega$ then
there is a c.e. real $\gamma$ with $\alpha <_S \gamma <_S \Omega$, and we also show that every incomplete
Solovay degree splits over each lesser degree.

The nonuniform nature of the argument is essential given the techniques we
use, since, in the splitting case, we have a priority construction in which the
control of the injuries is directly tied to the enumeration of $\Omega$. The fact that if a
c.e. real $\alpha$ is Solovay-incomplete then $\Omega$ must grow more slowly than $\alpha$ is what
allows us to succeed. (We will discuss this more fully in Sect. 2.) This unusual
technique is of some technical interest, and clearly cannot be applied to proving
upwards density, since in that case the top degree is the degree of $\Omega$ itself. To
prove upwards density, we use a different technique, taking advantage of the fact
that, however we construct a c.e. real, it is automatically dominated by $\Omega$.

In light of these results, and further motivated by the general question of 
how randomness can be produced, it is natural to ask whether the complete 
Solovay degree can be split, or in other words, whether there exist nonrandom
c.e. reals $\alpha$ and $\beta$ such that $\alpha + \beta$ is random. We give a negative answer to this
question, thus characterizing the random c.e. reals as those c.e. reals that cannot 
be written as the sum of two c.e. reals of lesser Solovay degrees. 

We remark that there are (non-c.e.) nonrandom reals whose sum is random;
the following is an example of this phenomenon. Define the real $\alpha$ by letting
$\alpha(n) = 0$ if $n$ is even and $\alpha(n) = \Omega(n)$ otherwise. (Here we identify a real with
its dyadic expansion as above.) Define the real $\beta$ by letting $\beta(n) = 0$ if $n$ is odd
and $\beta(n) = \Omega(n)$ otherwise. Now $\alpha$ and $\beta$ are clearly nonrandom, but $\alpha + \beta = \Omega$
is random.
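
The arithmetic behind this example can be illustrated on a finite prefix. The sketch below is only an illustration under assumptions: the bit list stands in for a prefix of the expansion of the random real (its digits are made up), and the helper names are hypothetical. It shows that the two interleaved reals have disjoint supports, so their sum reproduces the original expansion without carries.

```python
from fractions import Fraction

def real_from_bits(bits):
    """Exact value of the finite binary expansion 0.b0 b1 b2 ..."""
    return sum(Fraction(b, 2 ** (n + 1)) for n, b in enumerate(bits))

def split_even_odd(bits):
    """alpha keeps the bits at odd positions, beta those at even positions."""
    alpha = [b if n % 2 == 1 else 0 for n, b in enumerate(bits)]
    beta = [b if n % 2 == 0 else 0 for n, b in enumerate(bits)]
    return alpha, beta

# Hypothetical stand-in for a prefix of the expansion (not actual digits).
prefix = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
alpha_bits, beta_bits = split_even_odd(prefix)

# No position is set in both expansions, so the sum needs no carries.
assert real_from_bits(alpha_bits) + real_from_bits(beta_bits) == real_from_bits(prefix)
```

Each of the two pieces has every second bit equal to 0, which is the reason the text calls them clearly nonrandom.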

Before turning to the precise statements of our main results and sketches of 
some of their proofs, we point out that there are other reducibilities one can 
study in this context. Coles, Downey, and LaForte [14,15] introduced one such
reducibility, called sw-reducibility; it is defined as follows. For sets of natural
numbers $A$ and $B$, we say that $A \le_{sw} B$ if there are a computable procedure $\Gamma$
and a constant $c$ such that $\Gamma^B = A$ and the use of $\Gamma$ on argument $x$ is bounded
by $x + c$. For reals $\alpha, \beta \in (0, 1]$, we say that $\alpha \le_{sw} \beta$ if there are sets $A$ and $B$
such that $\alpha = 0.\chi_A$, $\beta = 0.\chi_B$, and $A \le_{sw} B$.

As in the case of Solovay reducibility, it is not difficult to argue that if $\alpha \le_{sw} \beta$
then $H(\alpha \restriction n) \le H(\beta \restriction n) + O(1)$ for all $n \in \omega$. However, Coles, Downey, and
LaForte [14] showed that Solovay reducibility and sw-reducibility are different,
since there are c.e. reals $\alpha$, $\beta$, $\gamma$, and $\delta$ such that $\alpha \le_S \beta$ but $\alpha \not\le_{sw} \beta$ and
$\gamma \le_{sw} \delta$ but $\gamma \not\le_S \delta$, and that there are no minimal sw-degrees of c.e. reals.

Question 1.8. Is every random c.e. real sw-complete?

Question 1.9. Are the sw-degrees of c.e. reals dense? 

Ultimately, the basic reducibility we seek to understand is $H$-reducibility,
where $\sigma \le_H \tau$ if there is a constant $O(1)$ such that
$H(\sigma \restriction n) \le H(\tau \restriction n) + O(1)$ for all $n \in \omega$. Little is known about this directly.

2 Main Results

The following lemma, implicit in [32] and proved in [14], provides an alternate 
characterization of Solovay reducibility, which is the one that we will use below. 

Lemma 2.1. Let $\alpha$ and $\beta$ be c.e. reals, and let $\alpha_0, \alpha_1, \ldots$ and $\beta_0, \beta_1, \ldots$ be
computable increasing sequences of rationals converging to $\alpha$ and $\beta$, respectively.
Then $\beta \le_S \alpha$ if and only if there are a constant $d$ and a total computable function
$f$ such that for all $n \in \omega$,

\[ \beta - \beta_{f(n)} < d(\alpha - \alpha_n) . \]

Whenever we mention a c.e. real $\alpha$, we assume that we have chosen a
computable increasing sequence $\alpha_0, \alpha_1, \ldots$ converging to $\alpha$. The previous lemma
guarantees that, in determining whether one c.e. real dominates another, the
particular choice of such sequences is irrelevant. For convenience of notation, we
adopt the convention that, for any c.e. real $\alpha$ mentioned below, the expression
$\alpha_s - \alpha_{s-1}$ is equal to $\alpha_0$ when $s = 0$.

We begin by sketching the proof that every incomplete Solovay degree can 
be split over any lesser Solovay degree. 

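Theorem 2.2. Let $\gamma <_S \alpha <_S \Omega$ be c.e. reals. There are c.e. reals $\beta^0$ and $\beta^1$
such that $\gamma <_S \beta^0, \beta^1 <_S \alpha$ and $\beta^0 + \beta^1 = \alpha$.

Proof Sketch. We want to build $\beta^0$ and $\beta^1$ so that $\gamma \le_S \beta^0, \beta^1 \le_S \alpha$ and
$\beta^0 + \beta^1 = \alpha$, while satisfying the following requirement for each $e, k \in \omega$ and $i < 2$:

\[ R_{i,e,k} : \Phi_e \text{ total} \;\Rightarrow\; \exists n\, \bigl(\alpha - \alpha_{\Phi_e(n)} \ge k(\beta^i - \beta^i_n)\bigr) . \]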

It is not hard to check that, since $\gamma <_S \alpha$, there are a constant $c$ and a computable
increasing sequence $\gamma_0, \gamma_1, \ldots$ of rationals converging to $\gamma$ such that
$\gamma_s - \gamma_{s-1} < c(\alpha_s - \alpha_{s-1})$ for all $s \in \omega$. Since multiplying a c.e. real by a positive
integer does not change its Solovay degree, we may assume without loss of generality that
$2(\gamma_s - \gamma_{s-1}) < \alpha_s - \alpha_{s-1}$ for all $s \in \omega$.

Most of the essential features of our construction are already present in the
case of two requirements $R_{i,e,k}$ and $R_{1-i,e',k'}$, so we limit our discussion to this
case. We assume that $R_{i,e,k}$ has priority over $R_{1-i,e',k'}$ and that both $\Phi_e$ and
$\Phi_{e'}$ are total. We will think of the $\beta^j$ as being built by adding amounts to them
in stages. Thus $\beta^j_s$ will be the total amount added to $\beta^j$ by the end of stage $s$.
At each stage $s$ we begin by adding $\gamma_s - \gamma_{s-1}$ to the current value of each $\beta^j$; in
the limit, this ensures that $\beta^j \ge_S \gamma$.

We will say that $R_{i,e,k}$ is satisfied through $n$ at stage $s$ if $\Phi_e(n)[s]{\downarrow}$ and
$\alpha_s - \alpha_{\Phi_e(n)} > k(\beta^i_s - \beta^i_n)$. The strategy for $R_{i,e,k}$ is to act whenever either it is
not currently satisfied or the least number through which it is satisfied changes.
Whenever this happens, $R_{i,e,k}$ initializes $R_{1-i,e',k'}$, which means that the amount
of $\alpha - 2\gamma$ that $R_{1-i,e',k'}$ is allowed to funnel into $\beta^i$ is reduced. More specifically,
once $R_{1-i,e',k'}$ has been initialized for the $m$th time, the total amount that it is
thenceforth allowed to put into $\beta^i$ is reduced to $2^{-m}$.

The above strategy guarantees that if $R_{1-i,e',k'}$ is initialized infinitely often
then the amount put into $\beta^i$ by $R_{1-i,e',k'}$ (which in this case is all that is put
into $\beta^i$ except for the coding of $\gamma$) adds up to a computable real. In other words,
$\beta^i \equiv_S \gamma <_S \alpha$. But it is not hard to argue that this means that there is a stage $s$
after which $R_{i,e,k}$ is always satisfied and the least number through which it is
satisfied does not change. So we conclude that $R_{1-i,e',k'}$ is initialized only finitely
often, and that $R_{i,e,k}$ is eventually permanently satisfied.

This leaves us with the problem of designing a strategy for $R_{1-i,e',k'}$ that
respects the strategy for $R_{i,e,k}$. The problem is one of timing. To simplify
notation, let $d = \alpha - 2\gamma$ and $d_s = \alpha_s - 2\gamma_s$. Since $R_{1-i,e',k'}$ is initialized only
finitely often, there is a certain amount $2^{-m}$ that it is allowed to put into $\beta^i$
after the last time it is initialized. Thus if $R_{1-i,e',k'}$ waits until a stage $s$ such
that $d - d_s < 2^{-m}$, adding nothing to $\beta^i$ until such a stage is reached, then
from that point on it can put all of $d - d_s$ into $\beta^i$, which of course guarantees
its success. The problem is that, in the general construction, a strategy working
with a quota $2^{-m}$ cannot effectively find an $s$ such that $d - d_s < 2^{-m}$. If it uses
up its quota too soon, it may find itself unsatisfied and unable to do anything
about it.

The key to solving this problem (and the reason for the hypothesis that
$\alpha <_S \Omega$) is the observation that, since the sequence $\Omega_0, \Omega_1, \ldots$ converges much
more slowly than the sequence $d_0, d_1, \ldots$, we can use $\Omega$ to modulate the amount
that $R_{1-i,e',k'}$ puts into $\beta^i$. More specifically, at a stage $s$, if $R_{1-i,e',k'}$'s current
quota is $2^{-m}$ then it puts into $\beta^i$ as much of $d_s - d_{s-1}$ as possible, subject to
the constraint that the total amount put into $\beta^i$ by $R_{1-i,e',k'}$ since the last stage
before stage $s$ at which $R_{1-i,e',k'}$ was initialized must not exceed $2^{-m}\Omega_s$. It can
be shown that the fact that $\Omega >_S \alpha$ implies that there is a stage $v$ after which
$R_{1-i,e',k'}$ is allowed to put all of $d - d_v$ into $\beta^i$.

In general, at a given stage $s$ there will be several requirements, each with
a certain amount that it wants (and is allowed) to direct into one of the $\beta^j$.
We work backwards, starting with the weakest priority requirement that we are
currently considering. This requirement is allowed to direct as much of $d_s - d_{s-1}$
as it wants (subject to its current quota, of course). If any of $d_s - d_{s-1}$ is left
then the next weakest priority strategy is allowed to act, and so on up the line.

□ 
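
The last step of the sketch, in which the stage increment is handed out starting from the weakest priority requirement, can be pictured with a small routine. This is a simplified illustration and not the construction itself: the function name, the list `allowances`, and the convention that index 0 is the strongest priority are assumptions made here, and the way each allowance is derived from the quotas and from the approximation to the random real is not modelled. The proof sketch also does not say what happens to any leftover, so the routine only reports it.

```python
from fractions import Fraction

def distribute(increment, allowances):
    """Share out `increment` among requirements, weakest priority first.

    allowances[p] is the most that the requirement of priority p
    (0 = strongest) may still take at this stage.  Returns the amount
    each requirement takes, together with whatever is left over.
    """
    taken = [Fraction(0)] * len(allowances)
    remaining = Fraction(increment)
    for p in reversed(range(len(allowances))):  # weakest priority acts first
        taken[p] = min(remaining, Fraction(allowances[p]))
        remaining -= taken[p]
    return taken, remaining
```

For instance, with allowances 1/4, 1/8, 1/16 (strongest to weakest) and an increment of 1/10, the weakest requirement takes its full 1/16, the middle one takes the remaining 3/80, and the strongest receives nothing at this stage.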

It is also possible to show that the Solovay degrees are upwards dense, that
is, that if $\gamma <_S \Omega$ is a c.e. real then there is a c.e. real $\beta$ such that $\gamma <_S \beta <_S \Omega$.
We omit the proof in the interest of space. Together with the previous theorem,
this result implies that the Solovay degrees are dense.

Theorem 2.3. The Solovay degrees of c.e. reals are dense. 

We finish by sketching a proof that the hypothesis that $\alpha <_S \Omega$ in the
statement of Theorem 2.2 is necessary. This fact will follow easily from a stronger
result which shows that, despite the upwards density of the Solovay degrees,
there is a sense in which the complete Solovay degree is very much above all
other Solovay degrees. We begin by noting the following lemma, which gives a
useful sufficient condition for domination.

Lemma 2.4. Let $f$ be an increasing total computable function and let $k > 0$ be
a natural number. Let $\alpha$ and $\beta$ be c.e. reals for which there are infinitely many
$s \in \omega$ such that $k(\alpha - \alpha_s) > \beta - \beta_{f(s)}$, but only finitely many $s \in \omega$ such that
$k(\alpha_t - \alpha_s) > \beta_{f(t)} - \beta_{f(s)}$ for all $t > s$. Then $\beta \le_S \alpha$.

Theorem 2.5. Let $\alpha$ and $\beta$ be c.e. reals, let $f$ be an increasing total computable
function, and let $k > 0$ be a natural number. If $\beta$ is random and there are
infinitely many $s \in \omega$ such that $k(\alpha - \alpha_s) > \beta - \beta_{f(s)}$ then $\alpha$ is random.

Proof Sketch. By taking $\beta_{f(0)}, \beta_{f(1)}, \ldots$ instead of $\beta_0, \beta_1, \ldots$ as an approximating
sequence for $\beta$, we may assume that $f$ is the identity. If $\alpha$ is rational then we
can replace it with a nonrational computable real $\alpha'$ such that $\alpha' - \alpha'_s \ge \alpha - \alpha_s$
for all $s \in \omega$, so we may assume that $\alpha$ is not rational.

We assume that $\alpha$ is nonrandom and there are infinitely many $s \in \omega$ such
that $k(\alpha - \alpha_s) > \beta - \beta_s$, and show that $\beta$ is nonrandom. The idea is to take a
Solovay test $A = \{I_i : i \in \omega\}$ such that $\alpha \in I_i$ for infinitely many $i \in \omega$ and use
it to build a Solovay test $B = \{J_i : i \in \omega\}$ such that $\beta \in J_i$ for infinitely many
$i \in \omega$.

Let

\[ U = \{ s \in \omega : k(\alpha - \alpha_s) > \beta - \beta_s \} . \]

It is not hard to show that $U$ is $\Delta^0_2$, except in the trivial case in which $\beta \equiv_S \alpha$.
Thus a first attempt at building $B$ could be to run the following procedure for
all $i \in \omega$ in parallel. Look for the least $t$ such that there is an $s < t$ with $s \in U[t]$
and $\alpha_s \in I_i$. If there is more than one number $s$ with this property then choose
the least among such numbers. Begin to add the intervals

\[ [\beta_s, \beta_s + k(\alpha_{s+1} - \alpha_s)],\ [\beta_s + k(\alpha_{s+1} - \alpha_s), \beta_s + k(\alpha_{s+2} - \alpha_s)],\ \ldots \qquad (*) \]

to $B$, continuing to do so as long as $s$ remains in $U$ and the approximation of $\alpha$
remains in $I_i$. If the approximation of $\alpha$ leaves $I_i$ then end the procedure. If $s$
leaves $U$, say at stage $u$, then repeat the procedure (only considering $t \ge u$, of
course).

If $\alpha \in I_i$ then the variable $s$ in the above procedure eventually assumes
a value in $U$. For this value, $k(\alpha - \alpha_s) > \beta - \beta_s$, from which it follows that
$k(\alpha_u - \alpha_s) > \beta - \beta_s$ for some $u > s$, and hence that $\beta \in [\beta_s, \beta_s + k(\alpha_u - \alpha_s)]$.
So $\beta$ must be in one of the intervals (*) added to $B$ by the above procedure.

Since $\alpha$ is in infinitely many of the $I_i$, running the above procedure for all
$i \in \omega$ guarantees that $\beta$ is in infinitely many of the intervals in $B$. The problem
is that we also need the sum of the lengths of the intervals in $B$ to be finite, and
the above procedure gives no control over this sum, since it could easily be the
case that we start working with some $s$, see it leave $U$ at some stage $t$ (at which
point we have already added to $B$ intervals whose lengths add up to
$k(\alpha_{t-1} - \alpha_s)$), and then find that the next $s$ with which we have to work is much
smaller than $t$. Since this could happen many times for each $i \in \omega$, we would have
no bound on the sum of the lengths of the intervals in $B$.

This problem would be solved if we had an infinite computable subset $T$ of
$U$. For each $I_i$, we could look for an $s \in T$ such that $\alpha_s \in I_i$, and then begin
to add the intervals (*) to $B$, continuing to do so as long as the approximation
of $\alpha$ remained in $I_i$. (Of course, in this easy setting, we could also simply add
the single interval $[\beta_s, \beta_s + k|I_i|]$ to $B$.) It is not hard to check that this would
guarantee that if $\alpha \in I_i$ then $\beta$ is in one of the intervals added to $B$, while also
ensuring that the sum of the lengths of these intervals is less than or equal to
$k|I_i|$. Following this procedure for all $i \in \omega$ would give us the desired Solovay
test $B$. Unless $\beta \le_S \alpha$, however, there is no infinite computable $T \subseteq U$, so we
use Lemma 2.4 to obtain the next best thing.

Let

\[ S = \{ s \in \omega : \forall t > s\, (k(\alpha_t - \alpha_s) > \beta_t - \beta_s) \} . \]

If $\beta \le_S \alpha$ then $\beta$ is nonrandom, so, by Lemma 2.4, we may assume that $S$ is
infinite. Furthermore, $S$ is co-c.e. by definition, but it has the additional useful
property that if a number $s$ leaves $S$ at stage $t$ then so do all numbers in the
interval $(s, t)$.

To construct $B$, we run the following procedure $P_i$ for all $i \in \omega$ in parallel.
Note that $B$ is a multiset, so we are allowed to add more than one copy of a
given interval to $B$.

1. Look for an $s \in \omega$ such that $\alpha_s \in I_i$.

2. Let $t = s + 1$. If $\alpha_t \notin I_i$ then terminate the procedure.

3. If $s \notin S[t]$ then let $s = t$ and go to step 2. Otherwise, add the interval

\[ [\beta_s + k(\alpha_{t-1} - \alpha_s),\ \beta_s + k(\alpha_t - \alpha_s)] \]

to $B$, increase $t$ by one, and repeat step 3.

This concludes the construction of $B$. It is not hard to show that the sum of
the lengths of the intervals in $B$ is finite and that $\beta$ is in infinitely many of the
intervals in $B$. □
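
Steps 1 to 3 of the procedure can be traced on finite data. The following is a rough sketch under several assumptions: the approximations are given as finite lists `alpha` and `beta` of exact rationals, the test interval is treated as a closed pair `(lo, hi)`, the stage-`t` approximation to the set S is computed by checking the defining inequality for all stages up to `t`, and the run simply stops when the data runs out. All function names are hypothetical; the actual construction runs forever on computable approximations.

```python
from fractions import Fraction

def in_S(s, t, alpha, beta, k):
    """Stage-t approximation: s has not yet violated the defining condition."""
    return all(k * (alpha[u] - alpha[s]) > beta[u] - beta[s]
               for u in range(s + 1, t + 1))

def run_procedure(alpha, beta, k, interval):
    """Follow steps 1-3 on finite lists; return the intervals added to B."""
    lo, hi = interval
    horizon = len(alpha) - 1              # last stage for which data exists
    added = []
    # Step 1: look for an s whose approximation lies in the interval.
    s = next((j for j in range(horizon + 1) if lo <= alpha[j] <= hi), None)
    if s is None:
        return added
    while True:
        # Step 2: move to the next stage, stop if alpha_t left the interval.
        t = s + 1
        if t > horizon or not (lo <= alpha[t] <= hi):
            return added
        # Step 3: keep adding intervals while s still looks like a member of S.
        while t <= horizon and in_S(s, t, alpha, beta, k):
            added.append((beta[s] + k * (alpha[t - 1] - alpha[s]),
                          beta[s] + k * (alpha[t] - alpha[s])))
            t += 1
        if t > horizon:
            return added
        s = t                             # s left S at stage t: restart at step 2
```

Consecutive intervals added for a fixed `s` share endpoints, so their total length telescopes to `k * (alpha[t] - alpha[s])`, which is what keeps the total length under control while `s` stays in S.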

Corollary 2.6. If $\alpha$ and $\beta$ are c.e. reals such that $\alpha + \beta$ is random then at least
one of $\alpha$ and $\beta$ is random.

Combining Theorem 2.2 and Corollary 2.6, we have the following results, the 
second of which also depends on Theorem 1.4. 

Theorem 2.7. A c.e. real $\gamma$ is random if and only if it cannot be written as
$\alpha + \beta$ for c.e. reals $\alpha, \beta <_S \gamma$.

Theorem 2.8. Let d be a Solovay degree. The following are equivalent:

1. d is incomplete. 

2. d splits. 

3. d splits over any lesser Solovay degree.

Abstract. We study $k$-partition communication protocols, an extension of the
standard two-party best-partition model to $k$ input partitions. The main results
are as follows.

1. A strong explicit hierarchy on the degree of non-obliviousness is established
by proving that using $k + 1$ partitions instead of $k$ may decrease the
communication complexity from $\Theta(n)$ to $\Theta(\log k)$.

2. Certain linear codes are hard for $k$-partition protocols even when $k$ may be
exponentially large (in the input size). On the other hand, one can show that
all characteristic functions of linear codes are easy for randomized OBDDs.

3. It is proven that there are subfunctions of the triangle-freeness function and
the function $\oplus\mathrm{CLIQUE}_{n,3}$ which are hard for multipartition protocols. As an
application, truly exponential lower bounds on the size of nondeterministic
read-once branching programs for these functions are obtained, solving an
open problem of Razborov [17].



1 Introduction 

The communication complexity of two-party protocols was introduced by Yao [18]. 
The initial goal was to develop a method for proving lower bounds on the complexity 
of distributed and parallel computations. In the meantime, communication complexity 
has been successfully applied as a tool for proving lower bounds in various other models 
of computation (see, e. g., [7, 12] for a survey). 

Let $f: \{0,1\}^n \to \{0,1\}$ be a Boolean function defined on a set $X$ of $n$ Boolean
variables, and let $\Pi = (X_1, X_2)$ be a balanced partition of $X$, i.e., a partition with
$-1 \le |X_1| - |X_2| \le 1$.

* The work of the first and second author has been supported by DFG grant Hr 14/3-2, and of
the fourth author by DFG grant We 1066/9-1.

A deterministic two-party communication protocol $P$ for $f$ according to $\Pi$ is an
algorithm by which two players, called Alice and Bob, can evaluate $f$ as follows. At
the beginning of the computation, Alice obtains an input $x: X_1 \to \{0,1\}$ and Bob an
input $y: X_2 \to \{0,1\}$. Then the players communicate according to $P$ by exchanging
messages. The players may use unbounded resources to compute their messages. At the
end, one of them has to output $f(x, y)$. A nondeterministic protocol allows each player
to access a (private) string of nondeterministic bits as an additional input. It is required
that there is an assignment to the nondeterministic bits such that the protocol outputs 1
if and only if $f(x, y) = 1$.

The complexity of a nondeterministic protocol $P$ is the maximum of the number of
exchanged bits taken over all inputs, including the nondeterministic bits. The
nondeterministic communication complexity of $f$ according to $\Pi$, $\mathrm{ncc}(f, \Pi)$, is the minimum
complexity of a nondeterministic protocol according to $\Pi$ which computes $f$. Finally,
the (best-partition) nondeterministic communication complexity of $f$, $\mathrm{ncc}(f)$, is
defined as the minimum of $\mathrm{ncc}(f, \Pi)$ over all balanced partitions $\Pi$ of the set of input
variables of $f$.

A protocol is oblivious because it uses only one partition of the set of input variables
for all inputs. Most applications of communication complexity are therefore restricted
to oblivious models of computation. However, Borodin, Razborov, and Smolensky [5]
succeeded in deriving exponential lower bounds for the non-oblivious model of
computation of (syntactic) read-$k$-times branching programs. Their approach leads, from
the perspective of communication protocols, to the following notion of multipartition
communication protocols [8]:

Definition 1. Let $f$ be a Boolean function defined on a set $X$ of Boolean variables, and
let $k$ be a positive integer. A $k$-partition protocol $P$ for $f$ is a collection of $k$
nondeterministic (sub-)protocols $P_1, \ldots, P_k$, each $P_i$ with its own balanced partition of $X$, such
that $f = P_1 \vee P_2 \vee \cdots \vee P_k$, where we use $P_i$ also to denote the function computed
by protocol $P_i$. If $m_i$ is the number of submatrices of $P_i$ (i.e., $m_i$ is the number of
leaves in the protocol tree of $P_i$), then the complexity of $P$ is $\lceil \log(\sum_{i=1}^{k} m_i) \rceil$. The
$k$-partition communication complexity of $f$, $k\text{-}\mathrm{pcc}(f)$, is the minimum complexity of a
$k$-partition protocol computing $f$. The multipartition communication complexity of $f$ is
$\mathrm{mpcc}(f) := \min\{ k\text{-}\mathrm{pcc}(f) \mid k \in \mathbb{N} \}$.
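
The complexity measure in Definition 1 depends only on the leaf counts of the subprotocols. As a small illustration (the function name is hypothetical, and logarithms are assumed to be base 2), it can be computed exactly with integer arithmetic:

```python
def protocol_complexity(leaf_counts):
    """ceil(log2(m_1 + ... + m_k)) for subprotocols with m_i leaves each."""
    total = sum(leaf_counts)
    if total < 1:
        raise ValueError("a protocol needs at least one leaf")
    # For an integer total >= 1, (total - 1).bit_length() == ceil(log2(total)).
    return (total - 1).bit_length()
```

Using `bit_length` avoids floating-point rounding near powers of two, where `math.log2` followed by `math.ceil` could be off by one for very large totals.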

To better understand the model of multipartition communication, we compare
$\mathrm{mpcc}(f)$ with the best-partition nondeterministic communication complexity $\mathrm{ncc}(f)$.
Let $f: \{0,1\}^n \to \{0,1\}$ be a Boolean function, $A \subseteq f^{-1}(1)$, and let $\Pi$ be a
partition of the variables of $f$. Define the distribution $p_A$ on $\{0,1\}^n$ by $p_A(x) := |A|^{-1}$ if
$x \in A$, and $p_A(x) := 0$ otherwise. Define $B_{A,\Pi}(f) := \log(1/\max_M p_A(M))$, where
the maximum extends over all all-1 submatrices $M$ of the communication matrix of $f$
according to $\Pi$.

We have $\mathrm{ncc}(f, \Pi) = \max_{A \subseteq f^{-1}(1)} B_{A,\Pi}(f) + O(\log n)$ by the proof of
Theorem 2.16 in [12], and consequently
$\mathrm{ncc}(f) = \min_{\Pi} \max_{A \subseteq f^{-1}(1)} B_{A,\Pi}(f) + O(\log n)$,
where the minimum extends over all balanced partitions $\Pi$ of the variables
of $f$. A similar argument yields:

Lemma 1. For every Boolean function $f: \{0,1\}^n \to \{0,1\}$,

\[ \mathrm{mpcc}(f) = \max_{A \subseteq f^{-1}(1)} \min_{\Pi} B_{A,\Pi}(f) + O(\log n) . \]

When dealing with multipartition communication complexity, the notion of
rectangles as introduced by Borodin, Razborov, and Smolensky [5] is useful. Let $X$ be
a set of $n$ variables and let $\Pi = (X_1, X_2)$ be a balanced partition of $X$. A function
$r: \{0,1\}^n \to \{0,1\}$ defined on $X$ is called a rectangle (with respect to $\Pi$) if it can
be written as $r = r^{(1)} \wedge r^{(2)}$, where the functions $r^{(i)}$ depend only on variables from $X_i$,
$i = 1, 2$. Given a Boolean function $f$ defined on $X$, its rectangle complexity $R(f)$ is the
minimal number $t$ for which there exist $t$ rectangles $r_1, r_2, \ldots, r_t$ (each with its own
partition of the variables in $X$) such that $f = r_1 \vee r_2 \vee \cdots \vee r_t$. The $k$-partition rectangle
complexity $R_k(f)$ of $f$ is the minimal number of rectangles needed to cover $f$ under
the restriction that these rectangles may use at most $k$ different partitions. Note that

\[ R_k(f) = \min \bigl( R_1(f_1) + R_1(f_2) + \cdots + R_1(f_k) \bigr), \]

where the minimum is taken over all $k$-tuples of Boolean functions $f_1, f_2, \ldots, f_k$ with
$f_1 \vee f_2 \vee \cdots \vee f_k = f$. Furthermore, $R(f) = \min_k R_k(f)$. We obtain:

Proposition 1. For all Boolean functions $f$,

\[ \lceil \log R_k(f) \rceil = k\text{-}\mathrm{pcc}(f), \quad\text{and}\quad \lceil \log R(f) \rceil = \mathrm{mpcc}(f) . \]
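
For small numbers of variables, the rectangle property used above can be tested by brute force. The sketch below is an illustration only: the function name and the representation of a partition by the index list `part1` of its first block are assumptions made here, and balancedness of the partition is not checked. It relies on the observation that a function is a rectangle exactly when its set of 1-inputs equals the product of its projections onto the two blocks.

```python
from itertools import product

def is_rectangle(r, n, part1):
    """Check whether r: {0,1}^n -> {0,1} is a rectangle w.r.t. (part1, rest)."""
    part2 = [j for j in range(n) if j not in part1]
    ones = [x for x in product((0, 1), repeat=n) if r(x)]
    left = {tuple(x[j] for j in part1) for x in ones}
    right = {tuple(x[j] for j in part2) for x in ones}
    # The 1-inputs always embed into left x right, so equal sizes
    # mean the 1-inputs are exactly the full product.
    return len(ones) == len(left) * len(right)
```

For example, the equality test of the first and third of four variables is not a rectangle when those two variables are separated by the partition, but it is one when both lie in the same block.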

The measure $R(f)$ can also be used to prove lower bounds on the size of
nondeterministic read-once branching programs (1-n.b.p. for short): Borodin, Razborov, and
Smolensky [5] have shown that every Boolean function $f$ requires a 1-n.b.p. of size at
least [...]. In fact this lower bound is $R(f)/(2n)$ for $n$-input functions $f$ due to an
observation of Okolnishnikova [15].

The goal of this paper is to develop lower bounds for the fundamental measures
$\mathrm{mpcc}(f)$ and $R(f)$, respectively, and to apply these results to branching programs. In the
following, we give an overview of the paper.

1. In [8], an exponential gap between $\mathrm{ncc}(f) = 1\text{-}\mathrm{pcc}(f)$ and $2\text{-}\mathrm{pcc}(f)$ has been
shown. In Section 2 (Theorem 1), we prove that for infinitely many $n$ and for all
$k = k(n)$, there is an explicitly defined function $f_{k,n}: \{0,1\}^n \to \{0,1\}$ such that

\[ k\text{-}\mathrm{pcc}(f_{k,n}) = \Omega(n), \quad\text{and}\quad (k+1)\text{-}\mathrm{pcc}(f_{k,n}) = O(\log k) . \]

Thus, a small increase of the degree of non-obliviousness can result in an unbounded
decrease of communication complexity.

2. In Section 3, we observe that an argument from [9, 15] yields a linear lower bound
on the multipartition communication complexity of the characteristic function of a
random linear code. Moreover, $\mathrm{mpcc}(\mathrm{BCH}_n) \ge \log R(\mathrm{BCH}_n) =$ [...] for the
characteristic function of the BCH code of length $n$ and designed distance $d = 2t + 1$
with $t \approx$ [...] (Theorem 2).

On the other hand, we prove that the complement of a linear code can be computed
by small randomized OBDDs with arbitrarily small one-sided error (Theorem 3).
Thus we obtain the apparently best known tradeoff between randomized and
nondeterministic branching program complexity.

3. In Section 4, we consider the problem of determining whether a given graph has no
triangles. The corresponding triangle-freeness function $\Delta_n$ has $n = \binom{m}{2}$ Boolean
variables (one for each potential edge) and accepts a given graph $G$ on $m$ vertices if
and only if $G$ has no triangles. We prove that there is a subfunction $\Delta'_n$ of $\Delta_n$ with
$R(\Delta'_n) = 2^{\Omega(n)}$ (Theorem 4).

Although this result does not imply a lower bound on the rectangle complexity (and
thus the multipartition complexity) of the triangle-freeness function $\Delta_n$ itself, the
result has an interesting consequence for nondeterministic read-once branching
programs. Razborov ([17], Problem 11) asks whether a truly exponential lower bound
holds for the function $\oplus\mathrm{Clique}_{n,3}$ on $n = \binom{m}{2}$ variables which outputs the
parity of the number of triangles in a graph on $m$ vertices. In the case of
deterministic read-once branching programs, such a lower bound for $\oplus\mathrm{Clique}_{n,3}$ has been
proven by Ajtai et al. in [2]. We solve this problem by proving that nondeterministic
read-once branching programs for $\oplus\mathrm{Clique}_{n,3}$ and for the triangle-freeness
function $\Delta_n$ require size at least $2^{\Omega(n)}$. The only other truly exponential lower bounds
for nondeterministic read-once programs have been proven for a class of functions
based on quadratic forms in [3-5]. In the deterministic case, the recent celebrated
result of Ajtai [1] gives a truly exponential lower bound for a function similar to
$\oplus\mathrm{Clique}_{n,3}$ even for linear time branching programs.

2 A Strong Hierarchy on the Degree of Non-obliviousness

The goal of this section is to prove that allowing one more partition of the input variables 
can lead to an unbounded decrease of the communication complexity for explicitly 
defined functions. 

Theorem 1. For infinitely many $n$ and all $k = k(n)$, there is an explicitly defined
function $f_{k,n}: \{0,1\}^n \to \{0,1\}$ such that

\[ k\text{-}\mathrm{pcc}(f_{k,n}) = \Omega(n), \quad\text{and}\quad (k+1)\text{-}\mathrm{pcc}(f_{k,n}) = O(\log k) . \]

Furthermore, the upper bound can even be achieved by using $(k+1)$-partition protocols
where each protocol is deterministic.

We describe how the functions used in the proof of Theorem 1 are constructed. The
idea is to take some function $h$ which is known to be “hard” even if arbitrarily many
partitions are allowed. From $h$, a new function $f_k$ is constructed which will be “easy”
for $(k + 1)$-partition protocols, but “hard” for $k$-partition protocols.

For $h: \{0,1\}^m \to \{0,1\}$, the respective function $f_k$ is defined on vectors of
variables $x = (x_1, \ldots, x_{2m})$, $y = (y_0, \ldots, y_{\ell-1})$, and $z = (z_0, \ldots, z_{\ell-1})$, where
$\ell := \lceil \log(k + 1) \rceil$. We use a fixed set $\mathcal{P} = \{\Pi^*_1, \ldots, \Pi^*_{k+1}\}$ of balanced partitions of
the $x$-variables (described later on). For a given value $i$ from $\{1, \ldots, k + 1\}$ represented
by the $y$-variables, the vector $x$ is divided into two halves $x^1(i)$, $x^2(i)$ of length $m$
according to the partition $\Pi^*_i$. The function $f_k$ is defined by $f_k(x, y, z) := h(x^1(i))$.
(Observe that the $z$-variables are only used for “padding” the input.)

It is obvious that $f_k$ has $(k + 1)$-partition protocols of small complexity:
Proof of Theorem 1 - Upper Bound. The protocol for $f_k$ uses $k + 1$ partitions which
divide the $x$-vector according to the partitions in $\mathcal{P}$, and which give all $y$-variables to
the first player and all $z$-variables to the second player. In the $i$th subprotocol, the first
player outputs $h(x^1(i))$ if $i$ is the value represented by the $y$-variables, and 0 otherwise.
The second player does nothing. The complexity of the whole protocol is obviously
$\lceil \log(2(k+1)) \rceil = \lceil \log(k+1) \rceil + 1$. □
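
The upper-bound protocol is simple enough to simulate. The sketch below is illustrative and makes several assumptions: indices are 0-based rather than starting at 1, a partition is represented only by the tuple of positions forming the first half, the value encoded by the y-variables is passed directly as an integer, and all function names are hypothetical. It checks the behaviour described in the proof: only the subprotocol whose index matches the y-value can output 1, so the disjunction of all subprotocols equals the function itself.

```python
def f_k(h, first_halves, i, x):
    """f_k(x, y, z) = h(x^1(i)), where i is the value encoded by the y-variables."""
    return h(tuple(x[j] for j in first_halves[i]))

def subprotocol(h, first_halves, i, y_value, x):
    """The i-th subprotocol: the first player answers only if y encodes i."""
    if y_value != i:
        return 0
    return h(tuple(x[j] for j in first_halves[i]))

def protocol_output(h, first_halves, y_value, x):
    """Disjunction of the outputs of all k + 1 subprotocols."""
    return int(any(subprotocol(h, first_halves, i, y_value, x)
                   for i in range(len(first_halves))))

def ceil_log2(n):
    """Exact ceil(log2(n)) for an integer n >= 1."""
    return (n - 1).bit_length()
```

Each subprotocol has two leaves, which gives the total of $2(k+1)$ leaves behind the complexity stated in the proof; `ceil_log2(2 * (k + 1)) == ceil_log2(k + 1) + 1` holds for every positive integer `k`.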

In the following, we can only give an outline of the proof of the lower bound. We
first describe the main combinatorial idea. If we can ensure that all the sets occurring
as halves of partitions in $\mathcal{P}$ (where $|\mathcal{P}| = k + 1$) are “very different,” then the partitions
in $\mathcal{P}$ cannot be “approximated” by only $k$ partitions, as the following lemma shows.

Lemma 2. Define the (Hamming) distance between two sets $A, B \subseteq \{1, \ldots, n\}$ by
$d(A, B) := |A \cap \bar{B}| + |\bar{A} \cap B|$. Let $\mathcal{A}$ and $\mathcal{B}$ be families of subsets of $\{1, \ldots, n\}$ with
$D \le d(A, A') \le n - D$ for all different $A, A' \in \mathcal{A}$ and $D/2 \le |B| \le n - D/2$ for
all $B \in \mathcal{B}$. If $|\mathcal{A}| \ge |\mathcal{B}| + 1$ then there exists an $A_0 \in \mathcal{A}$ and $S \in \{A_0, \bar{A}_0\}$ such that
$D/4 \le |S \cap B| \le n - D/4$ for all $B \in \mathcal{B}$.

Proof. We first show that there is an $A_0 \in \mathcal{A}$ such that $D/2 \le d(A_0, B) \le n - D/2$
for all $B \in \mathcal{B}$. Assume to the contrary that for each $A \in \mathcal{A}$ there is a $B \in \mathcal{B}$ such
that $d(A, B) < D/2$ or $d(\overline{A}, B) = n - d(A, B) < D/2$. Since $|\mathcal{A}| \ge |\mathcal{B}| + 1$, the
pigeonhole principle implies that there exists $B \in \mathcal{B}$ such that $d(S_1, B) < D/2$ and
$d(S_2, B) < D/2$ for some $S_1 \in \{A_1, \overline{A_1}\}$, $S_2 \in \{A_2, \overline{A_2}\}$ and $A_1 \ne A_2 \in \mathcal{A}$. But
then $d(S_1, S_2) \le d(S_1, B) + d(B, S_2) < D$, a contradiction.

Now fix some $B \in \mathcal{B}$. Define the real-valued $2 \times 2$ matrix $M = (m_{rs})$ by setting
$m_{11} := |A_0 \cap B|$, $m_{12} := |\overline{A_0} \cap B|$, $m_{21} := |A_0 \cap \overline{B}|$, and $m_{22} := |\overline{A_0} \cap \overline{B}|$.
We have $m_{11} + m_{12} = |B| \ge D/2$ and $m_{21} + m_{22} = |\overline{B}| \ge D/2$. Furthermore,
$m_{11} + m_{22} = d(A_0, \overline{B}) \ge D/2$ and $m_{12} + m_{21} = d(A_0, B) \ge D/2$. It follows that
there is at least one column of $M$ for which both elements are at least $D/4$. □
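
The last step of the proof (two row sums and two diagonal sums of at least D/2 force a column whose entries are both at least D/4) can be checked exhaustively on a small ground set. The sketch below is an illustration only; the function names and the parameters n = 6, D = 4 are assumptions chosen to keep the search small, and the matrix layout (rows for B and its complement, columns for A0 and its complement) is one reading of the definition of M.

```python
from itertools import combinations

def all_subsets(n):
    """All subsets of {1, ..., n} as Python sets."""
    items = range(1, n + 1)
    return [set(c) for r in range(n + 1) for c in combinations(items, r)]

def overlap_matrix(a0, b, n):
    """2x2 matrix M: rows are B and its complement, columns A0 and its complement."""
    universe = set(range(1, n + 1))
    a0_c, b_c = universe - a0, universe - b
    return [[len(a0 & b), len(a0_c & b)],
            [len(a0 & b_c), len(a0_c & b_c)]]

def heavy_column(a0, b, n, d):
    """Return 0 (for A0) or 1 (for its complement) if both entries of that
    column are at least d/4, and None if no column qualifies."""
    m = overlap_matrix(set(a0), set(b), n)
    for col in (0, 1):
        if 4 * m[0][col] >= d and 4 * m[1][col] >= d:
            return col
    return None

def check_all(n, d):
    """Check the column claim for every pair (A0, B) meeting the hypotheses
    D/2 <= |B| <= n - D/2 and D/2 <= d(A0, B) <= n - D/2."""
    subsets = all_subsets(n)
    for b in subsets:
        if not (d <= 2 * len(b) <= 2 * n - d):
            continue
        for a0 in subsets:
            dist = len(a0 ^ b)  # Hamming distance = size of symmetric difference
            if not (d <= 2 * dist <= 2 * n - d):
                continue
            if heavy_column(a0, b, n, d) is None:
                return False
    return True
```

Note that the qualifying column is found separately for each B; the sketch does not attempt to reproduce the rest of the lemma.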

In order to meet the requirements of Lemma 2, we choose $\mathcal{P}$ such that the
characteristic vectors of the $\Pi^*_i$ form a code $C \subseteq \{0, 1\}^{2m}$ with the following properties: (i)
All $x \in C$ have exactly $m$ ones and $m$ zeros, i. e., $C$ is a so-called balanced code. (ii)
Any two different codewords have Hamming distance at least $D = 2\delta m$ and at most
$2m - D = 2(1 - \delta)m$, $\delta > 0$ a constant. To construct a code with these properties and
exponentially many codewords, we start with a Justesen code (see, e. g., [13]), which is
a linear code with appropriate lower and upper bounds on the weight of its codewords,
and then “balance” the codewords by “padding.”

Let $\Pi^*_i = (\Pi^*_{i,1}, \Pi^*_{i,2})$, for $i = 1, \ldots, k+1$. Let $\Pi_i = (\Pi_{i,1}, \Pi_{i,2})$,
for $i = 1, \ldots, k$, be arbitrary balanced partitions. We apply Lemma 2 to
$\mathcal{A} = \{\Pi^*_{i,1} \mid i = 1, \ldots, k+1\}$ and $\mathcal{B} = \{X \cap \Pi_{i,1} \mid i = 1, \ldots, k\}$, where
$X = \{x_1, \ldots, x_{2m}\}$. This yields an index $i_0$ such that at least one half of the partition $\Pi^*_{i_0}$
has at least $D/4$ variables on both sides of all partitions $\Pi_i$, $i = 1, \ldots, k$. It is now easy
to prove the following.

Lemma 3. Let $\beta := D/(4m) = \delta/2$. There are partitions $\Pi'_1, \ldots, \Pi'_k$ of the variables
of $h$ which are $\beta$-balanced, i. e. $|\Pi'_{i,1}|, |\Pi'_{i,2}| \ge \lfloor \beta m \rfloor$ for $i = 1, \ldots, k$, and a
$k$-partition protocol for $h$ with these partitions which has complexity at most $k$-pcc$(f_k)$.
To obtain the desired lower bound for $f_k$, we require an explicitly defined function
$h$ which has large multipartition complexity even if the given partitions are only
$\beta$-balanced for some small constant $\beta > 0$. A linear lower bound of this type is
contained, e. g., in the results of Beame, Saks, and Thathachar ([4], Lemma 4) or in [11].

3 The Multipartition Communication Complexity of Linear Codes 

A (binary) code of length $n$ and distance $d$ is a subset of vectors $C \subseteq \{0, 1\}^n$ for which
the Hamming distance between any two vectors in $C$ is at least $d$. The following lemma
is implicit in [9, 15], where a stronger version has been used to show that linear codes
are hard for read-$k$-times branching programs:

Lemma 4 ([9, 15]). Let $C \subseteq \{0, 1\}^n$ be a code of distance $2t + 1$. Let $P$ be a
multipartition protocol computing the characteristic function of $C$. Then $P$ uses at least

log • 2“"^ bits of communication. 

The number of codewords and the distance of random linear codes are known to
meet the Gilbert-Varshamov bound [13]. As a consequence, the above lemma gives
linear lower bounds for the characteristic functions of such codes. To give a constructive
example, we consider binary BCH-codes with length $n = 2^m - 1$ and designed distance
$d = 2t + 1$; such a code has at least $2^n/(n+1)^t$ vectors and distance at least $d$. Let $\mathrm{BCH}_n$
be the characteristic function of such a BCH code with t « Using Lemma 4, we 
obtain: 

Theorem 2. Each multipartition protocol for $\mathrm{BCH}_n$ has complexity at least

On the other hand, all linear codes have small randomized communication com- 
plexity even in the fixed-partition model (we omit the easy proof): 

Proposition 2. Let $f_C$ be the characteristic function of a linear binary code $C$ of length $n$.
Then the two-party fixed-partition one-round bounded error communication complexity
of $f_C$ is $O(1)$ with public coins and $O(\log n)$ with private coins.
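
The paper omits the proof of Proposition 2. One natural way to obtain the public-coin bound, shown here purely as a hedged sketch and not as the authors' argument, is to test a random parity of the syndrome: the players share a random vector w, Alice sends the single bit contributed by her positions, and Bob compares it with his own. The [7,4] Hamming code, the split of positions, and all function names below are assumptions made for the example.

```python
from itertools import product

# Parity-check matrix of the [7,4] Hamming code (assumed toy example).
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
ALICE = [0, 1, 2]     # positions held by Alice (assumed fixed partition)
BOB = [3, 4, 5, 6]    # positions held by Bob

def partial_parity(w, x, positions):
    """Contribution of the given positions to w^T H x, reduced mod 2."""
    total = 0
    for row, w_j in zip(H, w):
        if w_j:
            total += sum(row[i] * x[i] for i in positions)
    return total % 2

def one_round(w, x):
    """Alice sends one bit; Bob accepts iff the two partial parities agree,
    which happens exactly when w^T H x is 0 mod 2."""
    alice_bit = partial_parity(w, x, ALICE)   # the only message sent
    bob_bit = partial_parity(w, x, BOB)
    return int(alice_bit == bob_bit)

def acceptance_count(x):
    """Number of public random strings w (out of 2^3) on which x is accepted."""
    return sum(one_round(w, x) for w in product((0, 1), repeat=len(H)))

def is_codeword(x):
    """Direct membership test: H x = 0 over GF(2)."""
    return all(sum(r[i] * x[i] for i in range(len(x))) % 2 == 0 for r in H)
```

Codewords are accepted for every w, while any other input is accepted for exactly half of the strings w, so a constant number of independent repetitions gives bounded error with O(1) bits.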

The characteristic functions $f_C$ of linear codes are known to be hard for different
models of branching programs, including $k$-n.b.p.’s (nondeterministic read-$k$-times
branching programs, where along any path no variable appears more than $k$ times) [9],
and $(1, +k)$-b.p.’s (deterministic branching programs, where along each consistent path
at most $k$ variables are allowed to be tested more than once) [10]. On the other hand,
the negation $\neg f_C$ is just an OR of at most $n$ scalar products of an input vector with
the rows of the corresponding parity-check matrix. Hence, for every linear code, the
characteristic function $\neg f_C$ of its complement has a small nondeterministic OBDD (an
OBDD is a read-once branching program where the variables along every path appear
according to a fixed order). We can strengthen this observation even to randomized
OBDDs with one-sided error.

Theorem 3. Let $C \subseteq \{0, 1\}^n$ be a linear code and let $f_C$ be its characteristic function.
Then, for every integer $r \ge 2$, $\neg f_C$ can be computed by a randomized OBDD of size
$O(n^{4r})$ with one-sided error at most $2^{-r}$.
Sketch of Proof. Let $H$ be the $m \times n$ parity-check matrix of $C$. Let $w$ be chosen
uniformly at random from $\{0, 1\}^m$. The essence of the construction is the simple fact that
$w^T H x \equiv 0 \bmod 2$ for $x \in C$, whereas $\mathrm{Prob}[w^T H x \not\equiv 0 \bmod 2] = 1/2$ for $x \notin C$.
We cannot use this representation of $f_C$ directly to construct a randomized OBDD,
since this OBDD would require exponentially many probabilistic nodes to randomly
choose the vector $w$.

To reduce the number of random bits, we apply an idea which has appeared in
different disguises in several papers (see, e. g., Newman [14]): By a probabilistic
argument it follows that, for all $\delta$ with $0 < \delta < 1/2$, there is a set $W \subseteq \{0, 1\}^m$ with
$|W| = O(n/\delta^2)$ such that for $w$ chosen uniformly at random from $W$ and all $x \notin C$,
$\mathrm{Prob}[w^T H x \not\equiv 0 \bmod 2] \ge 1/2 - \delta$. Choose $\delta = 1/5$ and let $W$ be the obtained set
of vectors.

Let $G$ be the randomized OBDD which starts with a tree on $\lceil \log |W| \rceil$ probabilistic
variables at the top by which an element $w \in W$ is chosen uniformly at random.
At the leaf of the tree belonging to the vector $w$, append a deterministic sub-OBDD
which checks whether $w^T H x \equiv 0 \bmod 2$. By the above facts, this randomized OBDD
computes $\neg f_C$ with one-sided error at most $7/10$. The size of $G$ is bounded by $O(n^2)$.

To decrease the error probability, we regard $G$ as a deterministic OBDD on all
variables (deterministic and probabilistic ones). Applying the known OBDD-algorithms,
we obtain an OBDD $G'$ for the OR of $2r$ copies of $G$ with different sets of probabilistic
variables. This OBDD $G'$ has one-sided error at most $(7/10)^{2r} < 2^{-r}$ and size $O(n^{4r})$.

□ 
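
The error amplification in the last step is plain arithmetic: an OR of independent copies errs only if every copy errs, and (7/10)^2 = 49/100 < 1/2. A minimal check with exact rationals follows; the function names are hypothetical and the independence of the copies is the assumption stated in the proof sketch (disjoint sets of probabilistic variables).

```python
from fractions import Fraction

def amplified_error(single_error, copies):
    """One-sided error of an OR of independent copies: all copies must fail."""
    return single_error ** copies

def meets_target(r):
    """Does an OR of 2r copies with error 7/10 each reach error below 2^(-r)?"""
    return amplified_error(Fraction(7, 10), 2 * r) < Fraction(1, 2 ** r)
```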

Apparently, this result gives the strongest known tradeoff between nondeterministic 
and randomized branching program complexity. 



4 A Lower Bound for Triangle-Freeness 

The triangle-freeness function $\Delta_n$ is a function on $n = \binom{m}{2}$ Boolean variables (encoding
the edges of an $m$-vertex graph) which, given a graph $G$ on $m$ vertices, accepts it if
and only if $G$ has no triangles. The function $\oplus\mathrm{Clique}_{n,3}$ has the same set of variables
and outputs the parity of the number of triangles in $G$.

Theorem 4. There is a subfunction $\Delta'_n$ of $\Delta_n$ such that $R(\Delta'_n) = 2^{\Omega(n)}$. The same
holds also for $\oplus\mathrm{Clique}_{n,3}$.

This result is sufficient to prove that each nondeterministic read-once branching
program detecting the triangle-freeness of a graph requires truly exponential size. Since
by assigning constants to some variables, we can only decrease the branching program
size, the desired lower bound on the size of any 1-n.b.p. computing $\Delta_n$ follows directly
from Theorem 4 and the fact that each Boolean function $f$ on $n$ variables requires a
1-n.b.p. of size at least $R(f)/(2n)$ (as mentioned in the introduction). We obtain the
following main result which also answers Problem 11 of Razborov from [17].

Theorem 5. Nondeterministic read-once branching programs for the triangle-freeness
function $\Delta_n$ as well as for $\oplus\mathrm{Clique}_{n,3}$ require size $2^{\Omega(n)}$.
Remark. Using a similar probabilistic argument, the following has recently been
proven in [ 11 ]: (i) (ii) provided k < 2‘^'^ 

for a sufficiently small constant c > 0; and (iii) there is a constant C > 0 such that 
syntactic nondeterministic read-$k$-times branching programs, detecting the absence of
4-cliques in a graph on m vertices, require size at least ). Moreover, it is 

shown that Theorem 4 remains true also for $\beta$-balanced partitions, for all constants $\beta$
with $0 < \beta < 1/2$.

4.1 Outline of the Proof of Theorem 4 

We give the details only for $\Delta_n$ and discuss the changes required for $\oplus\mathrm{Clique}_{n,3}$ at
the end of this section. To define the desired subfunction of $\Delta_n$, we consider graphs
on $m$ vertices partitioned into sets $U = \{1, \ldots, m/2\}$ and $V = \{m/2 + 1, \ldots, m\}$.
The subfunction $\Delta'_n$ will depend only on variables corresponding to the edges in the
bipartite graph $U \times V$; the variables corresponding to the edges within the parts $U$ and
$V$ will be fixed. Hence, $\Delta'_n$ will still have $m^2/4$ variables.

The proof consists essentially of two parts: First, we probabilistically construct an
assignment which fixes the subgraphs $G_U$ and $G_V$ on the vertex sets $U$ and $V$. After
fixing these graphs, we obtain a subfunction of $\Delta_n$ which depends only on variables
belonging to edges in the bipartite graph $G_B = U \times V$. We then consider only those
partitions $\Pi$ which are balanced with respect to the bipartite (non-fixed) part. Our goal
is to choose the graphs $G_U$ and $G_V$ such that none of them contains a triangle and the
resulting graph $G = G_U \cup G_V \cup G_B$ contains many triangles whose bipartite edges
belong to different halves of a partition.

A pair of edges in $U \times V$ is called a test if they form a triangle together with an edge
from $G_U$ or $G_V$. Two tests are said to collide if a triangle can be formed by picking
one edge from the first test, one edge from the second test, and an edge from $G_U \cup G_V$.
In particular, tests collide if they share an edge.

Given a balanced partition $\Pi = (E_1, E_2)$ of the edges in $U \times V$, say that a test is
hard for $\Pi$ if each part $E_i$ of the partition contains one edge of the test. The following
lemma about graph partitions is the core of our argument.

Lemma 5. Let $\Pi_1, \ldots, \Pi_k$ be $k \le 2^{\alpha m^2}$ balanced partitions of $U \times V$, where $\alpha > 0$
is a sufficiently small constant. Then there exist triangle-free graphs $G_U$ and $G_V$ such
that the resulting graph $G = G_U \cup G_V \cup G_B$ has a set $T$ of tests such that $T$ does not
contain any colliding pairs, and $T$ contains a subset $T_i$ of $\Omega(m^2)$ hard tests for each
$\Pi_i$, $i = 1, \ldots, k$.

Let us first show how this lemma implies the theorem; we will then sketch the proof
of the lemma itself.

Choose $G_U$ and $G_V$ according to the lemma and let $\Delta'_n$ be the resulting subfunction
on $U \times V$. We construct a set $A$ of hard inputs for $\Delta'_n$ which will already require many
rectangles to be covered. Edge variables outside of $T$ are fixed to 0 for all inputs in
$A$. For each test in $T$, we then choose exactly one edge and set the respective variable
to 1; the second one is set to 0. Thus, the graph corresponding to an input in $A$ has
precisely one of the two edges of each test in $T$, and two graphs differ only on edges
in $T$. Since no two tests in $T$ collide, the graphs are triangle-free and we obtain a total
of $2^{|T|}$ graphs. Hence, $|A| = 2^{|T|}$.

Now let functions $f_1, \ldots, f_k$ be given with $\Delta'_n = f_1 \vee \cdots \vee f_k$, $k \le 2^{\alpha m^2}$, and
$\sum_{i=1}^{k} R_1(f_i) = R_k(\Delta'_n)$, and let $\Pi_1, \ldots, \Pi_k$ be the partitions corresponding to
optimal covers of $f_1, \ldots, f_k$ by rectangles. Then there is at least one function $f_i$ with
$|f_i^{-1}(1) \cap A| \ge |A|/k = 2^{|T|}/k$. By Lemma 5, there is a set $T_i \subseteq T$ of $\gamma = \Omega(m^2)$
tests which are hard for the partition $\Pi_i$. Let $B \subseteq f_i^{-1}(1) \cap A$ be a set of maximum
size such that two different inputs from $B$ differ in at least one bit corresponding to a
test in $T_i$. Then $|B| \ge |f_i^{-1}(1) \cap A| / 2^{|T| - \gamma} \ge 2^{\gamma}/k$.

Since all the inputs from $B$ are accepted by $f_i$, it remains to show that no rectangle
$r \le f_i$ with the underlying partition $\Pi_i$ can accept more than one input from $B$. Assume
that $(a, b)$ and $(a', b')$ are two different inputs in $B$ accepted by $r$. By the choice of $B$,
they differ in a test $t = \{e_1, e_2\}$ which is hard for $\Pi_i$, i. e., whose edges belong to
different halves of the partition $\Pi_i$. By the definition of $A$, exactly one of the two edges
$e_1$ and $e_2$ is present in each of the graphs belonging to $(a, b)$ and $(a', b')$, resp., and
these edges are different. Now, if $r(a, b) = 1$, then $r(a, b') = 0$ or $r(a', b) = 0$ because
either the graph corresponding to $(a, b')$ or to $(a', b)$ will contain both edges $e_1, e_2$,
which, together with the corresponding edge of $G_U$ or $G_V$, form a triangle. This is a
contradiction to the fact that $r$ is a rectangle. Altogether, we have completed the proof
of the lower bound for $\Delta_n$.

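Changes for $\oplus\mathrm{Clique}_{n,3}$. We consider the subfunction $\oplus\mathrm{Clique}'_{n,3}$ which is
obtained from $\oplus\mathrm{Clique}_{n,3}$ in the same way as $\Delta'_n$ from $\Delta_n$. Let $t := |T|$. For
$x, y \in \{0, 1\}^t$, define $\mathrm{IP}_t(x, y) := \sum_{i=1}^{t} x_i y_i \bmod 2$. Define the set $A$ of hard inputs for
$\oplus\mathrm{Clique}'_{n,3}$ as follows: For all $(x, y) \in \mathrm{IP}_t^{-1}(1)$, include the input obtained by setting
variables outside of $T$ to 0 and setting the two edge variables of the $i$th test in $T$ to $x_i$
and $y_i$, resp. Then $|A| = |\mathrm{IP}_t^{-1}(1)| \ge 2^{2t-2}$ and $A \subseteq (\oplus\mathrm{Clique}'_{n,3})^{-1}(1)$.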

Following the proof for $\Delta'_n$, we obtain a set $B$ of at least $2^{2\gamma-2}/k$ inputs from
$A$ which are hard for one of the partitions $\Pi_i$ in a cover of $\oplus\mathrm{Clique}'_{n,3}$. Using the
well-known fact that $|r^{-1}(1)| \le 2^t$ for each rectangle $r \le \mathrm{IP}_t$, one easily proves that
no rectangle $r' \le \oplus\mathrm{Clique}'_{n,3}$ can contain more than $2^{\gamma}$ inputs from $B$. Thus, at least
$2^{\gamma-2}/k$ rectangles are needed to cover $B$. □

4.2 Sketch of Proof for Lemma 5 

Recall that a test is a pair of edges in $U \times V$ which form a triangle together with an
edge in $G_U$ or $G_V$, and that a test is hard with respect to a partition $\Pi$ if its two edges
lie in different halves of $\Pi$.

Lemma 6. There exist graphs $G_U$ and $G_V$ such that:

(i) each of the graphs $G_U$ and $G_V$ has $O(m)$ edges, at most $O(1)$ triangles, and at
most $O(m)$ paths of length 2 or 3; and

(ii) for every balanced partition $\Pi$ of $U \times V$, there are $h = \Omega(m^2)$ tests which are
hard for $\Pi$.
Sketch of Proof. We prove the existence of the desired graphs by a probabilistic
argument. In what follows, let $G_U$ ($G_V$) stand for the random graph on $U$ (resp., on $V$)
obtained by inserting the edges independently at random with probability $p = \Theta(1/m)$
each*. Using Markov’s inequality, it is easy to show that the graphs $G_U$ and $G_V$ have
the properties described in Part (i) of the lemma with probability at least $1/2$. It remains
to prove that, with probability larger than $1/2$, for every balanced partition of $U \times V$,
there are at least $h$ hard tests.

Let $\Pi$ be such a balanced partition. The partition $\Pi$ distributes the edges in $U \times V$
to two sets of size $m^2/8$ each which are given to the players Alice and Bob. Call a
vertex mixed if each of the two players has at least $\frac{1}{8} \cdot \frac{m}{2}$ bipartite edges incident to it.

Claim 1. There are $\Omega(m)$ mixed vertices in each of the sets $U$ and $V$.

Proof of the Claim. We use essentially the same argument as Papadimitriou and Sipser
in [16]. W. l. o. g., assume that we have at most $\varepsilon m$ mixed vertices in $V$, where $\varepsilon > 0$ is
a sufficiently small constant ($\varepsilon < 1/112$ works fine). Call a vertex $v$ an A-vertex (resp.
B-vertex) if Alice (resp. Bob) has at least $\frac{7}{8} \cdot \frac{m}{2}$ edges incident to $v$. Thus, vertices which
are neither A- nor B-vertices are mixed. Observe first that the number of A-vertices as
well as the number of B-vertices in each of the sets $U$ and $V$ is at most $b_{\max} := \frac{4}{7} \cdot \frac{m}{2}$,
since otherwise Alice or Bob would have more than $m^2/8$ edges. On the other hand, the
number of A-vertices as well as the number of B-vertices in $V$ is bounded from below
by $b_{\min} := \frac{3}{7} \cdot \frac{m}{2} - \varepsilon m$, since otherwise there would be more than $\varepsilon m$ mixed vertices
in $V$, contrary to the assumption.

Now more than half of the edges from A-vertices in $U$ to B-vertices in $V$ belong to
Alice, because otherwise there will be an A-vertex $u \in U$ such that Alice has at most
half of the edges from $u$ to B-vertices in $V$, and thus altogether at most
$\frac{1}{2} \cdot b_{\max} + \frac{m}{2} - b_{\min} = \frac{1}{2} \cdot \frac{4}{7} \cdot \frac{m}{2} + \frac{m}{2} - (\frac{3}{7} \cdot \frac{m}{2} - \varepsilon m) \le \frac{3m}{7} + \varepsilon m < \frac{7}{8} \cdot \frac{m}{2}$
edges incident to $u$.

With the same reasoning, however, more than half of all edges from A-vertices in $U$ to
B-vertices in $V$ belong to Bob. Contradiction. □
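
The constant 1/112 can be recovered by exact arithmetic. The sketch below assumes the thresholds (1/8)(m/2) for mixed vertices and (7/8)(m/2) for A- and B-vertices, together with b_max = (4/7)(m/2) and b_min = (3/7)(m/2) - εm; these values are inferred from the surrounding argument rather than read off verbatim, and the names are hypothetical. All quantities are expressed in units of m.

```python
from fractions import Fraction as F

# Threshold (in units of m) above which Alice's edge count makes u an A-vertex.
A_VERTEX_THRESHOLD = F(7, 8) * F(1, 2)

def alice_edge_bound(eps):
    """Upper bound (in units of m) on Alice's edges at u if she owns at most
    half of the edges from u to B-vertices: b_max / 2 + m / 2 - b_min."""
    b_max = F(4, 7) * F(1, 2)
    b_min = F(3, 7) * F(1, 2) - eps
    return b_max / 2 + F(1, 2) - b_min

def contradiction_holds(eps):
    """True if the bound drops below the A-vertex threshold, i.e. u could
    not have been an A-vertex after all."""
    return alice_edge_bound(eps) < A_VERTEX_THRESHOLD
```

The slack between 7/16 and 3/7 is exactly 1/112, which is why any ε below that value suffices.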

For each mixed vertex $u \in U$, let $V_A(u)$ ($V_B(u)$) be the set of vertices $v \in V$ for
which Alice (resp. Bob) has the edge $\{u, v\}$. Since $u$ is mixed, $|V_A(u)|, |V_B(u)| \ge \frac{1}{8} \cdot \frac{m}{2}$.
Observe that each edge between $V_A(u)$ and $V_B(u)$ leads to a hard test with respect to
the given partition $\Pi$.

Claim 2. The following event has probability larger than $1/2$ with respect to the
random choices of $G_V$: For all pairs of disjoint sets $S_1, S_2 \subseteq V$ of size at least $m/16$
each, the number of edges in $G_V$ between $S_1$ and $S_2$ is at least $p|S_1||S_2|/2$.

Proof of the Claim. The expected number of edges between fixed sets of vertices
$S_1$ and $S_2$ is $p|S_1||S_2|$. By Chernoff bounds, the true number of edges is at least
$p|S_1||S_2|/2$ with probability at least $1 - e^{-cm}$, where the constant $c > 0$ can be
adjusted by the choice of the constant in the definition of $p$. Since there are at most
$(2^{m/2})^2 = 2^m$ choices for the sets $S_1, S_2 \subseteq V$, the probability of the described event
is at least $1 - 2^m \cdot e^{-cm}$, which is larger than $1/2$ for appropriate $c$. □

* For the sake of simplicity, we omit the exact constant in the definition of p here. 
We apply the claim to the sets $V_A(u)$ and $V_B(u)$, where $u$ is a mixed vertex,
generated by all balanced partitions $\Pi$. Due to the claim, the event that, for each partition
$\Pi$ and all sets $V_A(u)$ and $V_B(u)$ generated by $\Pi$, these sets are connected by at least
$p|V_A(u)||V_B(u)|/2 = \Omega(m)$ edges, has probability larger than $1/2$. Thus, with
probability larger than $1/2$, for each partition $\Pi$ there are $\Omega(m^2)$ hard tests generated by the
$\Omega(m)$ mixed vertices. This completes the proof of the lemma. (Observe that it does not
matter whether we carry out the above argument for mixed vertices in $U$ or in $V$.) □

We apply Lemma 6 and fix graphs $G_U$ and $G_V$ with the described properties. Since
there are only $O(1)$ triangles, we can remove these triangles without destroying the
other properties. In particular, we still have linearly many edges. By Property (ii), this
pair of graphs produces a set of $h = \Omega(m^2)$ hard tests $T_i$ for each of the partitions $\Pi_i$
($i = 1, \ldots, k$) from a given multipartition protocol for $\Delta_n$.

Let $T_0$ be the set of all tests induced by $G_U$ and $G_V$, and let $t = |T_0|$ be its size.
Since both graphs $G_U$ and $G_V$ have $O(m)$ edges, $t = \Theta(m^2)$. Using the properties of
these graphs stated in Lemma 6 (i), it is easy to show (by case analysis) that at most
$O(t)$ of all $\binom{t}{2}$ pairs of tests in $T_0$ will collide:

Lemma 7. There are at most $O(t)$ pairs of colliding tests in $T_0$.

To finish the proof of Lemma 5, it remains to find a subset $T \subseteq T_0$ such that:
(i) there is no pair of tests from $T$ which collide; and (ii) $|T \cap T_i| = \Omega(m^2)$ for all
$i = 1, \ldots, k$. We again use a probabilistic construction. Let $T$ be a set of $s$ tests picked
uniformly at random from the set $T_0$, where $s = \gamma t$ and $\gamma$ is a constant with $0 < \gamma < 1$
chosen later on.

Lemma 8.

(i) With probability at least $1/2$, the set $T$ contains at most $O(s^2/t)$ pairs of colliding
tests (where $t = |T_0|$ is the total number of tests).

(ii) With probability larger than $1/2$, $|T \cap T_i| \ge sh/(2t)$ for all $i = 1, \ldots, k$.

Proof. Part (i): We define the collision graph to have tests as vertices and edges for
each collision. Let $c$ be the number of edges in the collision graph. By Lemma 7, we
know that $c = O(t)$.

Let $c_T$ be the number of edges in the subgraph of the collision graph induced by the
randomly chosen set $T$. Since we pick tests uniformly at random, the expected number
of edges is $E[c_T] = \frac{s(s-1)}{t(t-1)} \cdot c$. By Markov’s inequality, it follows that the actual number
of edges is at most $2 \cdot E[c_T]$ with probability at least $1/2$. Hence, the number of pairs
of colliding tests in $T$ is at most $2 \cdot E[c_T] = O((s/t)^2 \cdot c) = O(s^2/t)$ with probability
at least $1/2$.

Part (ii): Consider a fixed partition $\Pi_i$. The probability to choose a hard test from
$T_i$ is $h/t$, $t = |T_0|$ the total number of tests. Thus the expected number of elements
in $T \cap T_i$ for a randomly chosen set $T$ of $s$ tests is $s \cdot h/t$. Let $\lambda := h/(2t)$. By
Chernoff bounds, it follows that $\mathrm{Prob}[|T \cap T_i| < \lambda \cdot s] < e^{-\Omega(s)}$. Hence,
the probability that $T$ contains at least $\lambda \cdot s = sh/(2t)$ hard tests for each of the partitions
is at least $1 - k \cdot e^{-\Omega(s)}$. Since $s = \gamma t = \Theta(m^2)$, this probability is larger than $1/2$ for
$k \le 2^{\alpha m^2}$ with $\alpha > 0$ sufficiently small. □
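
The expectation used in Part (i) can be verified exactly on a small collision graph. The sketch below assumes that T is a uniformly random s-element subset of the t tests (sampling without replacement), in which case each colliding pair survives with probability s(s-1)/(t(t-1)); the toy collision list and the function names are made up for the illustration.

```python
from fractions import Fraction
from itertools import combinations

def expected_collisions(t, s, collisions):
    """Exact E[c_T] for a uniformly random s-subset T of {0, ..., t-1},
    computed by enumerating all subsets."""
    subsets = [set(c) for c in combinations(range(t), s)]
    total = sum(
        sum(1 for a, b in collisions if a in chosen and b in chosen)
        for chosen in subsets
    )
    return Fraction(total, len(subsets))

def closed_form(t, s, c):
    """s(s-1)/(t(t-1)) * c, the value predicted by linearity of expectation."""
    return Fraction(s * (s - 1), t * (t - 1)) * c

# Toy collision graph on t = 8 tests with c = 4 colliding pairs (assumed).
COLLISIONS = [(0, 1), (1, 2), (2, 3), (5, 6)]
```

Since s(s-1)/(t(t-1)) is at most (s/t)^2, this is consistent with the O((s/t)^2 c) bound used in the proof.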
Lemma 8 yields the existence of a set $T \subseteq T_0$ with the following properties:
(i) $|T| = s = \gamma t$; (ii) there are at most $\delta s^2/t$ pairs of tests in $T$ which collide, $\delta > 0$
some constant; and (iii) for all $i = 1, \ldots, k$, $|T \cap T_i| \ge sh/(2t)$.

By deleting at most $\delta s^2/t$ tests from $T$, we remove all collisions, obtaining a smaller
set $T'$. The number of hard tests for each $\Pi_i$ in $T'$ is still at least
$sh/(2t) - \delta s^2/t = (s/t) \cdot (h/2 - \delta s) = \gamma \cdot (h/2 - \delta\gamma t)$.
Since this number is of the order $\Omega(m^2)$ for $\gamma = h/(4\delta t) = \Theta(1)$,
we have completed the proof of Lemma 5. □
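
The final choice of the constant can be checked by direct substitution. In the worked calculation below, $\gamma$ denotes the sampling rate $s/t$ and $\delta$ the constant from the collision bound (both symbols are a reading of the garbled source), and only $h = \Omega(m^2)$ and $t = O(m^2)$ are used.

```latex
\gamma\Bigl(\frac{h}{2} - \delta\gamma t\Bigr)\Big|_{\gamma = h/(4\delta t)}
  = \frac{h}{4\delta t}\Bigl(\frac{h}{2} - \frac{h}{4}\Bigr)
  = \frac{h^{2}}{16\,\delta\,t}
  = \Omega(m^{2}).
```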
Abstract. One of the fundamental properties of a graph is the number of distinct
eigenvalues of its adjacency or Laplacian matrix. Determining this number is of 
theoretical interest and also of practical impact. Graphs with small spectra exhibit 
many symmetry properties and are well suited as interconnection topologies. Es- 
pecially load balancing can be done on such interconnection topologies in a small 
number of steps. In this paper we are interested in graphs with maximal degree 
$O(\log n)$, where n is the number of vertices, and with a small number of distinct
eigenvalues. Our goal is to find scalable families of such graphs with polyloga- 
rithmic spectrum in the number of vertices. We present also the eigenvalues of 
the Butterfly graph. 



1 Introduction 

Spectral methods in graph theory have received great attention since their introduction 
and have proved to be a valuable tool for the theoretical and applied graph theory [7,3]. 
A (Laplacian or adjacency) matrix is associated to each graph. The set of the eigenval- 
ues of this matrix is called the (Laplacian or adjacency) spectrum of the graph; it is one 
of the most important algebraic invariants of a graph. Although in general a graph is not 
characterized uniquely by its spectrum, there is a strong connection between the eigen- 
values and many structural properties of the graph (diameter, bisection width, expansion 
etc). See [5] for a selection of results in this area. 

An important parameter connected to the spectrum of a graph is its size, i.e. the
number of distinct eigenvalues of the adjacency (or Laplacian) matrix of the graph. This
value is correlated with the symmetry properties of the graph: the only graph having two distinct
eigenvalues is the complete graph, and its automorphism group is as rich as possible,
namely the symmetric group. Graphs having three distinct eigenvalues are called strongly
regular; their diameter is 2 and they possess many interesting properties [16,3]. Another
well-studied class of highly symmetric graphs are distance-regular graphs [4]. The size of
their spectrum is $1 + \mathrm{diam}(G)$, which matches a lower bound for all graphs.

Several papers have been written about well-structured graphs.
Consider for example the hypercube $Q(d)$ as a vertex- and edge-symmetric graph. It has
$2^d$ vertices and a diameter resp. vertex degree of $d$. The hypercube has $d + 1$ distinct
eigenvalues and is widely used as an interconnection topology. Other graphs such as
cliques, complete bipartite graphs or the star have only 2 or 3 distinct eigenvalues, but
because of their high density are ill-suited as interconnection topologies.

There exist some graphs with an even better relation between number of vertices,
vertex degree, diameter and number of eigenvalues than the hypercube. One of them is
the Petersen graph, which has 10 vertices, a vertex degree of 3, diameter 2 and 3
different eigenvalues. Another one is the Cage(6, 6), which has 62 vertices, vertex degree
6, a diameter of 2 and only 3 distinct eigenvalues. A family of graphs with a very good
behavior is the family of the star graphs [2, 1]. The star graph of order $d$ has $d!$ vertices,
a vertex degree of $d - 1$, diameter $\lfloor \frac{3}{2}(d - 1) \rfloor$, and, using a previous work of Flatto, Odlyzko
and Wales, it turns out that it has only $2d - 1$ distinct eigenvalues [12].
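
The eigenvalue counts quoted for the hypercube and the Petersen graph can be reproduced without floating-point arithmetic. For a symmetric matrix the number of distinct eigenvalues equals the degree of its minimal polynomial, i.e. the smallest k for which A^k is a linear combination of I, A, ..., A^(k-1). The sketch below uses this standard fact with exact rational elimination; it is an illustration with hypothetical helper names, not a method taken from the paper, and the Petersen graph is built as the Kneser graph on 2-subsets of a 5-element set.

```python
from fractions import Fraction
from itertools import combinations

def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(vectors):
    """Rank of a list of vectors, by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in vectors]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            if rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def num_distinct_eigenvalues(adj):
    """Degree of the minimal polynomial of a symmetric matrix."""
    n = len(adj)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # A^0
    powers = []
    for k in range(n + 1):
        powers.append([v for row in power for v in row])  # flattened A^k
        if rank(powers) <= k:   # A^k depends on the lower powers
            return k
        power = mat_mul(power, adj)
    return n

def hypercube(d):
    """Adjacency matrix of Q(d): vertices differ in exactly one bit."""
    n = 1 << d
    return [[int(bin(u ^ v).count("1") == 1) for v in range(n)] for u in range(n)]

def petersen():
    """Petersen graph as the Kneser graph K(5, 2): disjoint pairs are adjacent."""
    verts = list(combinations(range(5), 2))
    return [[int(not set(u) & set(v)) for v in verts] for u in verts]
```

For example, num_distinct_eigenvalues(hypercube(3)) evaluates the d + 1 count for d = 3, and num_distinct_eigenvalues(petersen()) the count for the Petersen graph.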

In this paper we focus our attention on constructing scalable families of sparse 
graphs (maximal vertex degree $O(\log n)$ where n is the number of vertices) with small
spectra. We use the term scalable to denote a family of graphs, which contains for each 
natural n an n-vertex graph. Our motivation for studying this question comes from the 
area of load balancing in distributed systems. Let there be given an arbitrary, undirected, 
connected graph $G = (V, E)$ in which node $v \in V$ contains a load of $w(v)$. The goal is
to determine a schedule to move load across edges so that finally the load on each node 
will be the same. In each step load can be moved from any node to its neighbors. Com- 
munication between non-adjacent vertices is not allowed. This problem describes load 
balancing in synchronous distributed processor networks and parallel machines when 
we associate a node with a processor, an edge with a communication link and the load 
with identical, independent tasks [6,18]. 

Load balancing algorithms are typically based on a fixed topology which defines 
the load balancing partners in the system. Consider for example a bus system where 
each processor can communicate with any other processor in the network. To avoid 
high communication costs, we allow any processor to communicate only with a small 
number of other nodes in the system. Then we can define a topology, which has a 
small vertex degree and supports fast load balancing on the network. See also [8] for a 
practical point of view of this problem. 

Now the load balancing process can be split into two phases, the flow computation 
phase, which computes the network flow, and the migration phase, which migrates the 
load items according to the computed flow. Algorithms for the flow computation phase 
have been extensively studied. Many of them are local iterative schemes based on
diffusion or dimension exchange [10,11,13,14,15,18]. The diffusion algorithms studied in
the above mentioned papers calculate an $\ell_2$-optimal flow. We are interested in
topologies for which the optimal scheme OPT [11] (see also the next section) has a small
number of iteration steps (polylogarithmic in the number of vertices). Applying the
optimal scheme we need only $m - 1$ iterations, where $m$ is the number of distinct
eigenvalues of the Laplacian of the graph. In any iteration step a node has to communicate
with all of its neighbors, so the total cost of the load balancing algorithm depends on the 
number of the distinct eigenvalues of the graph and on its vertex degree. The number 
of steps is in fact the product of both. Therefore we are interested in topologies with a 
small product of the vertex degree and of the number of distinct eigenvalues. 
The paper is organized as follows. In section 2 we present the definitions and lem-
mas used in this paper to compute the eigenvalues of the graphs constructed below. In 
section 3 we compute the spectrum of the Butterfly graph and propose scalable families 
of trees of constant degree whose spectrum consists of 0(log^ n) different eigenvalues. 
Since the tree topology is not well suited for our application we present a scalable fam- 
ily of well connected graphs with at most 0(log^ n) distinct eigenvalues and a vertex 
degree of $O(\log n)$. In the last section we improve the previous results for the case of the
adjacency spectrum using another technique to obtain 0(log^ n) distinct eigenvalues, 
where the vertex degree still remains $O(\log n)$.

Concerning the product of the vertex degree and the number of distinct eigenvalues, 
the star graphs are the best graphs we know. In their case this product is 0(( 
where n is the number of vertices. We found scalable families of graphs with a good
behavior, but we did not reach this bound. We also do not know if this bound is optimal.
The only lower bound is $\Omega(\log n)$. So there are a lot of open problems which are left to
be solved in this important field.



2 Definitions and Lemmas 

In this paper we are interested in scalable families of graphs. A family of graphs $\mathcal{G}$ is
called scalable if for each $n \in \mathbb{N}$ there is an $n$-vertex graph $G \in \mathcal{G}$. The identity matrix
will be denoted $I_n \in \mathbb{R}^{n \times n}$. Symbols $J_{m,n}$, $0_{m,n}$ denote $m \times n$ matrices containing all
ones and all zeros, respectively. The spectrum of a matrix $A$ is the set of its eigenvalues:
$\mathrm{Sp}(A) = \{\lambda \mid \exists x \ne 0 : Ax = \lambda x\}$.

The operation “$\otimes$” denotes the Kronecker product: for matrices $A \in \mathbb{R}^{m \times n}$ and
$B \in \mathbb{R}^{k \times l}$, the matrix $A \otimes B \in \mathbb{R}^{mk \times nl}$ is the matrix obtained from $A$ by replacing
every element $a_{ij}$ by the block $a_{ij}B$.

Consider a (weighted) digraph $G = (V, E)$ with $w(e)$ being the weight of an edge
$e$. The adjacency matrix of $G$ is the matrix $A_G = (a_{ij})_{1 \le i,j \le |V|}$ where $a_{ij} = w(e_{ij})$
if an edge leads from a vertex $v_i$ to a vertex $v_j$ and $a_{ij} = 0$ otherwise ($a_{ii}$ is the
weight of a self-loop in $v_i$). The Laplacian matrix of $G$ is the matrix $L_G = D - A_G$
where $D = (d_{ij})$ is a diagonal matrix with entries $d_{ii} = \sum_j a_{ij}$. The spectrum of the
adjacency and Laplacian matrix of a graph $G$ will be denoted $\mathrm{Sp}_A(G)$ and $\mathrm{Sp}_L(G)$
and called adjacency and Laplacian spectrum of $G$, respectively. Note that for $d$-regular
graphs the adjacency and Laplacian spectrum are equivalent. The Laplacian spectrum
of a $d$-regular graph consists of all values $\lambda_L = d - \lambda_A$, where $\lambda_A$ is an eigenvalue of
the adjacency matrix and $d$ is the vertex degree of the graph.

We are looking in this paper for graphs where the optimal scheme OPT has a small
number of iteration steps. The optimal scheme is defined as follows. Let $\lambda_1, \lambda_2, \ldots, \lambda_m$
be the $m$ nonzero distinct eigenvalues of the Laplacian of the graph. Now, in the $t$-th
iteration step each vertex $v_i$ sends a load of $\frac{1}{\lambda_t} w_i^{t-1}$ to its neighbors, where $w_i^{t-1}$ is the load
of vertex $v_i$ after the iteration step $t - 1$. So in the $t$-th iteration step, the load of the
vertex $v$ has the form

$w^{t}(v) = w^{t-1}(v) - \sum_{\{v,u\} \in E} \frac{1}{\lambda_t} \bigl(w^{t-1}(v) - w^{t-1}(u)\bigr)$
After m steps, the load of the network will be totally balanced (see [11]). Note that this
implies that the diameter of the graph is less than or equal to m — 1 (see also [7]). 
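
As a small illustration of the scheme, the sketch below runs it on the hypercube Q(3), whose nonzero distinct Laplacian eigenvalues are 2, 4 and 6 (a standard fact about hypercubes). It assumes the update rule w(v) <- w(v) - sum over neighbors u of (w(v) - w(u)) / lambda_t; the function names and the initial load are hypothetical. Exact rationals are used so that the balanced state is reached exactly rather than approximately.

```python
from fractions import Fraction

def hypercube_edges(d):
    """Undirected edges of Q(d); each edge is listed once as (u, v) with u < v."""
    return [(u, u ^ (1 << b)) for u in range(1 << d) for b in range(d)
            if u < u ^ (1 << b)]

def opt_step(load, edges, lam):
    """One iteration: along every edge, (w(u) - w(v)) / lam moves from u to v."""
    new = list(load)
    for u, v in edges:
        flow = (load[u] - load[v]) / lam   # uses the loads before this step
        new[u] -= flow
        new[v] += flow
    return new

def balance(load, edges, nonzero_eigenvalues):
    """Run one step per nonzero distinct Laplacian eigenvalue."""
    load = [Fraction(x) for x in load]
    for lam in nonzero_eigenvalues:
        load = opt_step(load, edges, Fraction(lam))
    return load

EDGES_Q3 = hypercube_edges(3)
INITIAL = [8, 0, 0, 0, 0, 0, 0, 0]          # all load on one node (assumed)
FINAL = balance(INITIAL, EDGES_Q3, [2, 4, 6])
```

Each step annihilates the component of the load vector in one eigenspace, which is why three steps suffice here while stopping after two generally leaves an imbalance.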
The following is a well-known lower bound on the size of the spectrum:

Lemma 1. The number of distinct eigenvalues of the adjacency and Laplacian matrix 
of an undirected connected graph G with n vertices and maximal degree dis ^ ^ . 

Proof. We have seen above that for the diameter of $G$ it holds that $\mathrm{diam}(G) \le |\mathrm{Sp}_L(G)| - 1$
(e.g. [7]). By an argument known as Moore’s bound, a graph with maximal degree $d$
and diameter $\mathrm{diam}(G)$ can have at most $1 + d + d(d-1) + \cdots + d(d-1)^{\mathrm{diam}(G)-1}$
vertices; the lemma follows. For the case of the adjacency spectrum a similar argument
holds ([7]). □



In our approach to the construction of graphs with small spectra we shall use the 
following lemmas. The intuition is that, given a graph G with the correspond-
ing (adjacency or Laplacian) matrix A, we transform A using a suitable non-singular 
matrix X into a block-diagonal form. The spectrum of A is given by the union of 
the spectra of the particular block matrices. It is often convenient to view these block 
components as matrices of some simpler graphs. 

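Lemma 2. Let $n = p \cdot m + r$, and let $A \in \mathbb{R}^{n \times n}$ be a matrix of the form

$$A = \begin{pmatrix} C & J_{1,p} \otimes S \\ J_{p,1} \otimes R & I_p \otimes B + (J_{p,p} - I_p) \otimes X \end{pmatrix},$$

where $C \in \mathbb{R}^{r \times r}$, $S \in \mathbb{R}^{r \times m}$, $R \in \mathbb{R}^{m \times r}$ and $B, X \in \mathbb{R}^{m \times m}$. Then the spectrum of
A can be written as the union

$$Sp(A) = Sp(B - X) \cup Sp \begin{pmatrix} C & \sqrt{p} \cdot S \\ \sqrt{p} \cdot R & B + (p-1)X \end{pmatrix}. \qquad (1)$$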

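Proof. Consider the matrices

$$W = \begin{pmatrix} I_r & 0 \\ 0 & U \otimes I_m \end{pmatrix}, \qquad U = \frac{1}{\sqrt{p}} \begin{pmatrix} \sqrt{p} \cdot I_{p-1} & J_{p-1,1} \\ -\sqrt{p} \cdot J_{1,p-1} & 1 \end{pmatrix}$$

and

$$U^{-1} = \frac{1}{\sqrt{p}} \begin{pmatrix} \frac{1}{\sqrt{p}} \left( p \cdot I_{p-1} - J_{p-1,p-1} \right) & -\frac{1}{\sqrt{p}} J_{p-1,1} \\ J_{1,p-1} & 1 \end{pmatrix}. \qquad (2)$$

Using the transformation $Sp(A) = Sp(W^{-1} A W)$ we get

$$Sp(A) = Sp \begin{pmatrix} C & 0 & \sqrt{p} \cdot S \\ 0 & I_{p-1} \otimes (B - X) & 0 \\ \sqrt{p} \cdot R & 0 & B + (p-1)X \end{pmatrix}. \qquad (3)$$

The lemma follows by interchanging the second and the third row and column block. □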
As an example we present the case p = 2. Taking into consideration that in this
work we consider only symmetric matrices, $S = R^T$ holds and the matrix A has the
form

$$\begin{pmatrix} C & R^T & R^T \\ R & B & X \\ R & X & B \end{pmatrix} \quad \text{and is transformed by Lemma 2 to} \quad \begin{pmatrix} C & \sqrt{2} \cdot R^T & 0 \\ \sqrt{2} \cdot R & B + X & 0 \\ 0 & 0 & B - X \end{pmatrix}.$$

The matrix A above can be viewed as the adjacency matrix of a graph which is
constructed from a core (with the adjacency matrix C) and 2 copies of a graph with the
adjacency matrix B. The subgraph B appears twice in the graph, and therefore some
eigenvalues also appear more than once.
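The case p = 2 can be checked numerically on a tiny instance. In the sketch below (an illustration only; the matrices and helper names are assumptions chosen here) the core C is a single vertex, B is a single edge, X = 0, and the core is joined to the first vertex of each copy of B, so A is the adjacency matrix of a path on five vertices. The check compares the characteristic polynomial of A with the product of the characteristic polynomials of the two reduced blocks.

```python
import math

def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, result = len(a), 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return result

def char_poly(matrix, t):
    """Value of det(M - t*I)."""
    n = len(matrix)
    return det([[matrix[i][j] - (t if i == j else 0.0) for j in range(n)]
                for i in range(n)])

r2 = math.sqrt(2.0)
# Block order: core C, first copy of B, second copy of B.
A = [[0, 1, 0, 1, 0],
     [1, 0, 1, 0, 0],
     [0, 1, 0, 0, 0],
     [1, 0, 0, 0, 1],
     [0, 0, 0, 1, 0]]
reduced = [[0, r2, 0],      # (C, sqrt(2) R^T; sqrt(2) R, B + X)
           [r2, 0, 1],
           [0, 1, 0]]
diff = [[0, 1],             # B - X
        [1, 0]]
```

Since the two polynomials agree at more sample points than their degree requires only up to rounding, this is a sanity check of the statement, not a proof.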

From now on we will not distinguish between the notation of a matrix and an
edge-weighted graph. The vertices of C are connected to some vertices of both copies of B,
and these edges are represented by the matrix R. The spectrum of this graph is the union
of the spectra of two other graphs. The first is constructed from C and one copy of B,
where the matrix $\sqrt{2} \cdot R$ represents the edges between C and B + X. The second graph
is B - X.

In the following sections we consider I or 0 as the matrix X. We define Q as the
smallest matrix with the property $R^T = (Q^T \;\; 0)$. Note that $Q^T$ has the same number
of rows as the matrix C. Since not every vertex of B is connected to C, the number
of columns of $Q^T$ equals the number of vertices of B which have an adjacent vertex
in C.

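We denote by $T_{1,A}(C, B, Q, p)$ resp. by $T_{2,A}(C, B, Q, p)$ the graph described
by the matrix $T_{1,A}$ resp. $T_{2,A}$, where

$$T_{1,A} = \begin{pmatrix} C & J_{1,p} \otimes R^T \\ J_{p,1} \otimes R & I_p \otimes B \end{pmatrix}, \qquad T_{2,A} = \begin{pmatrix} C & J_{1,p} \otimes R^T \\ J_{p,1} \otimes R & I_p \otimes B + (J_{p,p} - I_p) \otimes I \end{pmatrix}.$$

In the following we define a sequence of graphs, where each graph is of the form
$T_{1,A}(C, B, Q, p)$ or $T_{2,A}(C, B, Q, p)$. Note that R describes Q in a unique way, so
$T_{1,A}$ is well-defined.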

Definition 1. Let $(G_k)_{1 \le k < \infty}$ be a sequence of graphs defined as follows. There exist
sequences of matrices $(C_k)_{1 \le k < \infty}$, $(Q_k)_{1 \le k < \infty}$ and of integers $(p_k)_{1 \le k < \infty}$ such that, for any
$1 < k < \infty$, $G_k = T_{1,A}(C_k, Q_k, G_{k-1}, p_k)$. The matrix $Q_k$ represents the edges between
the core $C_k$ of $G_k$ and the core $C_{k-1}$ of $G_{k-1}$; $Q_k$ has the same number of rows as
$C_k$ and the same number of columns as $C_{k-1}$. There are no edges between $C_k$ and
$G_{k-1} \setminus C_{k-1}$.

The sequence $(G'_k)_{1 \le k < \infty}$ is defined in a similar way as $(G_k)_{1 \le k < \infty}$. There exist
sequences of matrices $(C'_k)_{1 \le k < \infty}$, $(Q'_k)_{1 \le k < \infty}$ and a sequence
of integers $(p_k)_{1 \le k < \infty}$ such that, for any $1 < k < \infty$, $G'_k = T_{2,A}(C'_k, Q'_k, G'_{k-1}, p_k)$, where $Q'_k$ represents the
edges between the core $C'_k$ of $G'_k$ and the core $C'_{k-1}$ of $G'_{k-1}$. There is no edge between
$C'_k$ and $G'_{k-1} \setminus C'_{k-1}$.

To show how to calculate the eigenvalues of the graphs $G_k$ resp. $G'_k$ defined above,
we have to define the graph class $M_A(C_1, Q_{1,2}, C_2, Q_{2,3}, \ldots, C_k)$.
Definition 2. Let $C_1, C_2, \ldots, C_k$ be a sequence of matrices with $1 \le k < \infty$. Let
$Q_{1,2}, Q_{2,3}, \ldots, Q_{k-1,k}$ also be a sequence of matrices.

$M_A(C_1, Q_{1,2}, C_2, \ldots, Q_{k-1,k}, C_k)$ denotes the graph with a block tri-diagonal
adjacency matrix of the form

$$\begin{pmatrix}
C_k & Q_{k-1,k}^T & 0 & \cdots & 0 & 0 \\
Q_{k-1,k} & C_{k-1} & Q_{k-2,k-1}^T & \cdots & 0 & 0 \\
0 & Q_{k-2,k-1} & C_{k-2} & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & C_2 & Q_{1,2}^T \\
0 & 0 & 0 & \cdots & Q_{1,2} & C_1
\end{pmatrix}.$$

In the following lemma we show that the eigenvalues of $G_k$ defined in Definition 1
can be reduced to the eigenvalues of some graphs $M_A$ defined in Definition 2.

Lemma 3. The spectrum of $G_k = T_{1,A}(C_k, Q_k, G_{k-1}, p_k)$ is the union of the spectra
of k graphs $M_1, \ldots, M_k$, where $M_i = M_A(C_i, \sqrt{p_i} \cdot Q_i, C_{i-1}, \sqrt{p_{i-1}} \cdot Q_{i-1}, \ldots, C_1)$
for any $1 \le i \le k$.

Furthermore the spectrum of $G'_k = T_{2,A}(C'_k, Q'_k, G'_{k-1}, p)$ is the union of the
spectra of some graphs $M_{i,j}$, $1 \le i \le k$ and $1 \le j \le k - i$ for $i \ne k$ and

j = I for i = k. The graph Mtj = MA{CijA,y/p ■ Q), CijA, ■■ ■, C'tj.t-i), where 
= C'_i + {i-k + p{j - 1) + l{p - 1)) • I. 

Proof. The first statement of the lemma can be proved by induction. We can apply
Lemma 2 to $G_k$ and obtain that its eigenvalues are the union of the spectra of $G_{k-1}$
and of the graph $T_{1,A}(C^*_{k,k-1}, Q^*_{k-1}, G_{k-2}, p_{k-1})$, where $C^*_{k,k-1} = M_A(C_k, \sqrt{p_k} \cdot Q_k, C_{k-1})$.
As an example we present the matrix obtained after applying Lemma 2 twice to $G_k$
for $p_{k-2} = 2$:

$$\begin{pmatrix}
C_k & \sqrt{p_k}\, Q_k^T & 0 & 0 & 0 \\
\sqrt{p_k}\, Q_k & C_{k-1} & \sqrt{p_{k-1}}\, Q_{k-1}^T & 0 & 0 \\
0 & \sqrt{p_{k-1}}\, Q_{k-1} & C_{k-2} & R_{k-2}^T & R_{k-2}^T \\
0 & 0 & R_{k-2} & G_{k-3} & 0 \\
0 & 0 & R_{k-2} & 0 & G_{k-3}
\end{pmatrix}$$

Now we assume that after applying Lemma 2 i times, the eigenvalues are the union of the
spectra of $G_{k-1}, \ldots, G_{k-i}$ and of the graph

$$T_{1,A}(C^*_{k,k-1,\ldots,k-i}, Q^*_{k-i}, G_{k-i-1}, p_{k-i}),$$

with

$$C^*_{k,k-1,\ldots,k-i} = M_A(C_k, \sqrt{p_k} \cdot Q_k, C_{k-1}, \ldots, \sqrt{p_{k-i+1}} \cdot Q_{k-i+1}, C_{k-i})$$

and $Q^*_{k-i} = \begin{pmatrix} 0 \\ Q_{k-i} \end{pmatrix}$. After applying Lemma 2 once again we obtain that the
spectrum of $G_k$ can be obtained from the spectra of $G_{k-1}, \ldots, G_{k-i-1}$ and the spectrum
of the graph

$$T_{1,A}(C^*_{k,k-1,\ldots,k-i-1}, Q^*_{k-i-1}, G_{k-i-2}, p_{k-i-1}),$$

where

$$C^*_{k,k-1,\ldots,k-i-1} = M_A(C_k, \sqrt{p_k} \cdot Q_k, C_{k-1}, \ldots, C_{k-i-1})$$

and $Q^*_{k-i-1} = \begin{pmatrix} 0 \\ Q_{k-i-1} \end{pmatrix}$. After k such steps we obtain the first statement of the
lemma. The second statement can be proved in a similar way. □

A similar lemma can be also formulated for the Laplacian eigenvalues of such 
graphs. 

3 Spectra of Scalable and Non-scalable Topologies 

In this section we deal with some known families of graphs used as interconnection net- 
works. We begin with the Butterfly network, which is designed to have many favorable 
properties for distributed computing, such as small maximal degree and diameter, large 
connectivity etc. Next, we turn our attention to scalable families of graphs. First we 

construct the graphs with O ^ distinct eigenvalues of both adjacency 

and Laplacian matrix. Since trees are not well suited as interconnection topologies we 
define a new graph class with much better network properties. To compute the eigen- 
values, we use the technique presented in the previous section. 

3.1 The Spectrum of the Butterfly Graph 

The spectral properties of some interconnection networks like rings, tori, hypercubes
[7] or de Bruijn graphs [9] have been investigated. Now we present the adjacency and
Laplacian spectrum of the Butterfly without wrap-around edges.

The Butterfly graph BF(k) consists of k + 1 columns, each column containing $2^k$
vertices labeled with unique binary strings of length k. An edge connects two vertices
in BF(k) if and only if they are in consecutive i-th and (i + 1)-st columns and their
labels are either equal or differ only in the i-th bit.

Theorem 1. The adjacency spectrum of the Butterfly graph BF(k) is

$$Sp_A(BF(k)) = \left\{ 4 \cos\left( \frac{\pi i}{j+1} \right) \;\middle|\; 1 \le i \le j \le k + 1 \right\}.$$

Proof. The Butterfly graph BF(k) can be recursively constructed by taking two copies
of BF(k-1), adding one column of $2^k$ vertices and connecting these vertices to both
BF(k-1)'s. Thus the adjacency matrix of the Butterfly graph BF(k) has the structure
$T_{1,A}(C_k, BF(k-1), Q_k, 2)$ from Lemma 3 with $C_k = 0$, where $Q_k$ describes the edges
between the new column and the copies of BF(k-1). Using Lemma 3, we get
$Sp_A(BF(k)) = \bigcup_{i=0}^{k} \{Sp(M_i)\}$. The matrix $M_i$ can be viewed
as an adjacency matrix of a weighted binary tree with all edges of weight $\sqrt{2}$. After
permuting the vertices, Lemma 3 can be used again with $C_k = (0)$, $p_k = 2$ and
$Q_k = (\sqrt{2})$. Thus $Sp_A(BF(k)) = \bigcup_{i=0}^{k} \{Sp(M_i)\} = \bigcup_{j=0}^{k} \{Sp(M'_j)\}$. The
matrix $M'_j$ can be viewed as an adjacency matrix of a weighted path with j + 1
vertices where all edges have weight 2, thus using [7]
$Sp(M'_j) = 2\,Sp(P_{j+1}) = \left\{ 4\cos\left(\frac{\pi i}{j+2}\right) \;\middle|\; i = 1, \ldots, j+1 \right\}$. □
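The closed form can be compared with an explicitly built Butterfly graph. The sketch below is an illustration under stated assumptions: the function names are invented here, and the check relies on the fact that for a symmetric matrix A the product of the factors (A - λI) over a set of values vanishes exactly when that set contains every eigenvalue of A.

```python
import math

def butterfly_adjacency(k):
    """Adjacency matrix of BF(k) without wrap-around: k + 1 columns of 2**k vertices."""
    rows = 2 ** k
    n = (k + 1) * rows
    a = [[0.0] * n for _ in range(n)]
    for col in range(k):
        for label in range(rows):
            for other in (label, label ^ (1 << col)):   # equal label, or differing in one bit
                u, v = col * rows + label, (col + 1) * rows + other
                a[u][v] = a[v][u] = 1.0
    return a

def claimed_spectrum(k):
    """Distinct values 4*cos(pi*i/(j+1)) for 1 <= i <= j <= k + 1."""
    values = {}
    for j in range(1, k + 2):
        for i in range(1, j + 1):
            value = 4 * math.cos(math.pi * i / (j + 1))
            values.setdefault(round(value, 9), value)   # rounded key only removes duplicates
    return sorted(values.values())

def annihilates(a, values):
    """True if the product of (A - lam*I) over the given values is numerically zero."""
    n = len(a)
    prod = [[float(i == j) for j in range(n)] for i in range(n)]
    for lam in values:
        shifted = [[a[i][j] - (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
        prod = [[sum(x * y for x, y in zip(row, col)) for col in zip(*shifted)] for row in prod]
    return max(abs(x) for row in prod for x in row) < 1e-9
```

Dropping any single value from the claimed set should make the product nonzero, which confirms that each value really occurs as an eigenvalue.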

Using similar techniques the Laplacian spectrum of the Butterfly can be also com- 
puted. 

Theorem 2. The Laplacian spectrum of the Butterfly graph is 

SpA{BF(k)) = |4 - 4cos I 0 < i < j < /tj U 

U jd - 4cos + 1^^ ) I 1 - * - 7 < 

3.2 Adjacency and Laplacian Spectrum of Scalable Families of Graphs 

Now we turn our attention to scalable families of graphs. In the sequel, n will always
denote the number of vertices. Let us define the graph class T(d,n) as follows.

Definition 3. The tree T(d,n) is a rooted tree defined recursively. T(d,1) contains only
the root vertex. For n < d + 1, T(d,n) is a star with one root and n - 1 leaves connected
to it. For n > d, T(d,n) = (V, E) is constructed as follows. Let $s = \lfloor (n-1)/d \rfloor$ and
$q = n - 1 - ds$. Let $(V_1, E_1), \ldots, (V_d, E_d)$ be d disjoint copies of T(d,s) with respective roots
$r_1, \ldots, r_d$. Then

$$V = \{r\} \cup \{v_1, \ldots, v_q\} \cup \bigcup_{i=1}^{d} V_i \quad \text{and} \quad E = \{(r, v_i) \mid i = 1 \ldots q\} \cup \{(r, r_i) \mid i = 1 \ldots d\} \cup \bigcup_{i=1}^{d} E_i .$$

Informally, constructing a tree T(d,n) involves setting one vertex r as a root, then
dividing the remaining n - 1 vertices evenly and constructing a number of copies of
T(d,s). The roots of these copies are connected to r. Remaining vertices are added as
vertices with degree 1 connected to r.
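The recursive construction can be sketched directly. The code below is a minimal illustration; the vertex numbering and the names `build_tree` and `max_degree` are assumptions made here. It returns the edge list of T(d,n) with the root as vertex 0.

```python
def build_tree(d, n):
    """Return the edge list of T(d, n) on vertices 0..n-1; vertex 0 is the root."""
    edges = []

    def build(size, next_id):
        root = next_id
        next_id += 1
        if size < d + 1:                          # star: root plus size - 1 leaves
            for _ in range(size - 1):
                edges.append((root, next_id))
                next_id += 1
            return root, next_id
        s, q = (size - 1) // d, (size - 1) % d    # q = size - 1 - d*s
        for _ in range(q):                        # leftover vertices become leaves of the root
            edges.append((root, next_id))
            next_id += 1
        for _ in range(d):                        # d disjoint copies of T(d, s)
            child, next_id = build(s, next_id)
            edges.append((root, child))
        return root, next_id

    _, used = build(n, 0)
    assert used == n
    return edges

def max_degree(edges, n):
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return max(degree)
```

For example, `build_tree(2, 20)` yields a tree with 19 edges whose maximal degree stays within the bound stated for this class.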

Remark 1. The graph T(d,n) has n vertices and maximal degree at most 2d + 1.

Theorems. The graph has at most O ^ different eigenvalues. 

The proof of this theorem follows from Lemma 3 and it will be omitted because of 
space limitations. 

The choice of the parameter d results in graphs with different properties. Depending
on the application, one may ask either for the minimal number of eigenvalues or for the
minimal value of deg · |Sp|, where deg is the maximal degree of the graph. To get
a small number of eigenvalues, a reasonably large value of d should be chosen (e.g.,
d = O(log n)), and to get a small value of the product, a small value should be
chosen (e.g., d = 2).
However, trees are extremely ill-suited for the application of load balancing because
of their poor connectivity properties. To overcome this weak point, we give another 
construction of a graph iJ(„) with O ((logn)^) distinct eigenvalues which is much 
better suited as a topology for load balancing. 

Definition 4. Every graph H(n) has a set of distinguished vertices (core) denoted by
C(H(n)). The graph H(n) is defined recursively as follows. $H(1) = C(H(1)) = K_1$,
$H(2) = C(H(2)) = K_2$. For H(n) = (V, E), n > 2, let H(n) be of the form

$$T_{2,A}(C_n, Q_n, H(\lfloor (n-1)/2 \rfloor), 2),$$

where

$$C_n = \begin{cases} K_2 & \text{if n is even} \\ K_1 & \text{if n is odd} \end{cases}$$

and $Q_n = J$.

Informally, to construct H(n), first construct two copies of $H(\lfloor (n-1)/2 \rfloor)$ and connect
the corresponding vertices. The remaining 1 or 2 vertices of H(n) form its core and are
connected mutually and to all vertices from the cores of both copies (see Figure 1).
In consequence, H(n) has a vertex degree of at most log n + 5.
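The informal description translates into a short recursive construction. The following sketch is an illustration only; the vertex numbering and the function names are assumptions introduced here. It builds the edge set together with the core and can be used to check the vertex count and the degree bound for small n.

```python
import math

def build_h(n):
    """Return (edges, core) of H(n) on vertices 0..n-1, following the informal description."""
    if n == 1:
        return set(), [0]
    if n == 2:
        return {(0, 1)}, [0, 1]
    m = (n - 1) // 2
    sub_edges, sub_core = build_h(m)
    edges = set(sub_edges)                              # first copy: vertices 0..m-1
    edges |= {(u + m, v + m) for u, v in sub_edges}     # second copy: vertices m..2m-1
    edges |= {(v, v + m) for v in range(m)}             # connect corresponding vertices
    core = list(range(2 * m, n))                        # the remaining 1 or 2 vertices
    if len(core) == 2:
        edges.add((core[0], core[1]))
    for c in core:                                      # core joined to the cores of both copies
        for v in sub_core:
            edges.add((v, c))
            edges.add((v + m, c))
    return edges, core

def max_degree(edges, n):
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return max(degree)
```

For n = 17 this produces two copies of H(8) joined by a perfect matching plus a single core vertex.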




Fig. 1. The graph H(17).



Theorem 4. The graph iT(n) has at most 0(log^ n) distinct eigenvalues. 

This theorem can be proved using Lemma 3 and it will be omitted because of space 
limitations. 
4 Adjacency Spectrum

In this section we improve the previous results for the case of the adjacency
spectrum. Using another technique we construct a scalable family of graphs G(n) with
$O((\log n)^2)$ distinct eigenvalues.

The graph G(n) is constructed by taking a hypercube, subdividing each edge and
then replicating the vertices from the original hypercube a number of times (see Figure 4).

Definition 5. Let d (d > 0) satisfy the inequality $2^{d-1}(d+2) \le n < 2^{d}(d+3)$. Let
{^i}i=o ^ sequence defined as follows: 

ro fori>[l\ 

I (n - 2<^-^{d + 2)) mod 2‘^ for i = [|J 
[ Z\,+i mod J forO<i< [|J 

where $a \bmod b = a - b \cdot \lfloor a/b \rfloor$.

Let Q(d) be a hypercube. We shall refer to the vertices as binary strings from
$\{0,1\}^d$. The k-th level is defined as $L_k = \{x \in \{0,1\}^d \mid \|x\| = k\}$, where $\|x\|$ denotes the number of ones in x.

The graph G(n) is defined as follows. Consider the graph S(Q(d)) with $2^{d-1}(d+2)$
vertices obtained from the hypercube Q(d) by subdivision of each edge. For each node
x from the original graph Q(d) add a set $V_x$ of isolated vertices of cardinality $|V_x| =$

_l_ n -2 ^^(d+ 2 ) j g y ^ edges (y, v) for all edges 

{x, v) from S{Q{d)). 

Now, the diameter of a hypercube Q(n) is log(n), and from the construction of the
graph it follows:

Remark 2. The graph G(n) (n > 2) has n vertices and diam(G) < 2 log n.

The maximal degree of G(n) also follows from its definition.

Remark 3. The maximal degree of G(n) (n > 2) is at most 3 log n + o(log n).

To compute the number of distinct eigenvalues of G(n) we need two further
lemmas.

Lemma 4. [7] Let M be a non-singular square matrix, then

$$\begin{vmatrix} M & N \\ P & Q \end{vmatrix} = |M| \cdot |Q - P M^{-1} N| .$$

Lemma 5. Let G be a graph obtained from the hypercube Q(d) as follows. For every
vertex x add a self-loop with weight $w_k^2$, where $x \in L_k$. Each edge connecting vertices
on levels k, k + 1 has weight $w_k \cdot w_{k+1}$. Then the graph G has at most $O(d^2)$ distinct
eigenvalues.
The proof of this lemma is omitted because of space limitations.

Theorem 5. The adjacency matrix of G(n) (n > 2) has $O(\log^2 n)$ different eigenvalues.



Proof. Let A be the adjacency matrix of G(n). Consider an arbitrary vertex x from the
original hypercube Q(d) together with all vertices from $V_x$. Let C be the adjacency
matrix of the graph obtained from G(n) by removing the vertices $\{x\} \cup V_x$. Clearly A is
of the form of Lemma 2. After applying Lemma 2 iteratively to the sets $\{x\} \cup V_x$ for
each $x \in Q(d)$, we get a matrix A' with the same spectrum as A (except possibly the
eigenvalue 0). The matrix A' can be viewed as the adjacency matrix of a weighted graph
G' defined as follows. Consider a subdivided hypercube S(Q(d)). Each edge incident
with a vertex x from Q(d) has weight $\sqrt{1 + |V_x|}$.

Clearly, A' is of the form

$$A' = \begin{pmatrix} 0 & R^T D \\ D R & 0 \end{pmatrix},$$

where $R \in \mathbb{R}^{2^d \times d 2^{d-1}}$ is the vertex-edge incidence matrix of Q(d) and $D \in \mathbb{R}^{2^d \times 2^d}$ is a diagonal matrix
$D = (d_i \delta_{ij})_{ij} = (\sqrt{1 + |V_i|}\, \delta_{ij})_{ij}$. Now using Lemma 4 we get

$$|\lambda I - A'| = \begin{vmatrix} \lambda I & -R^T D \\ -D R & \lambda I \end{vmatrix} = |\lambda I| \cdot \left| \lambda I - D R \tfrac{1}{\lambda} R^T D \right| = \lambda^{d 2^{d-1}} \cdot \left| \tfrac{1}{\lambda} \left( \lambda^2 I - D R R^T D \right) \right| = \lambda^{d 2^{d-1} - 2^d} \cdot \left| \lambda^2 I - D R R^T D \right| .$$

Using the fact that $R R^T = Q + dI$, where Q is the
adjacency matrix of the hypercube Q(d), we can conclude that G(n) has at most 2m + 1
distinct eigenvalues, where m is the number of distinct eigenvalues of a graph G'' defined
as follows. Consider a hypercube Q(d). For each k = 0, ..., d, let $w_k = \sqrt{1 + |V_k|}$.
Add to each vertex x from Q(d) a self-loop with weight $w_k^2$, where $x \in L_k$. Each edge
connecting vertices in levels k and k + 1 has weight $w_k \cdot w_{k+1}$. Now using Lemma 5
we can show that G'' has only $O(\log^2 n)$ distinct eigenvalues.
Abstract. We study preemptive scheduling on uniformly related processors,
where jobs arrive one by one in an on-line fashion.

We consider the class of machine sets where the speed ratios are non- 
decreasing as speed increases. For each set of machines in this class, we 
design an algorithm of optimal competitive ratio. This generalizes the 
known result for identical machines, and solves other interesting cases. 
Keywords: Algorithms, scheduling. 



1 Introduction 

We consider on-line scheduling on m uniformly related machines. Jobs arrive on- 
line, and each job has to be assigned before the next job arrives. This scheduling 
model is called “scheduling jobs one by one” (see [9]). Preemption is allowed, 
hence each job may be cut into a few pieces. These pieces are to be assigned 
to possibly different machines, in non-overlapping time slots. (Non-preemptive 
algorithms are not allowed to cut the job and have to assign it continuously to 
one machine.) 

Each job j is associated with a weight w(j) and each machine i has a speed
$s_i$. The processing time of a job (or a part of a job) of weight w on machine i is
$w/s_i$. The machines are sorted so that $s_i \le s_{i+1}$ for $1 \le i \le m-1$ and $s_m = 1$.
The last condition is without loss of generality, since it is possible to scale any set
of speeds and job weights into this form.

For a given set of speeds, let $x = \sum_{i=1}^{m} s_i \Big/ \left( \sum_{i=1}^{m} s_i - 1 \right)$. Note that
$\sum_{i=1}^{m} s_i = \frac{x}{x-1}$ and $\sum_{i=1}^{m} s_i - 1 = \frac{1}{x-1}$.

The load of machine i, $L_i$, is equal to the sum of the processing times of all parts of
jobs assigned to machine i, on this machine.

The goal of an algorithm is to minimize the makespan, which is the maximum
load on any machine.

The quality of an on-line algorithm is measured by the competitive ratio,
that is, the worst-case ratio between $C_{on}$, which is the cost (the makespan, in our
case) of the on-line algorithm, and $C_{opt}$, which is the cost of an optimal off-line
algorithm that knows the whole sequence in advance.
In this paper we solve the case of non-decreasing speed ratios, i.e.
$s_{i-1}/s_i \le s_i/s_{i+1}$ for $2 \le i \le m-1$.

We give an algorithm of optimal competitive ratio for every set of speeds in this class.
Specifically, we design a deterministic preemptive algorithm of competitive ratio

$$\frac{x^m}{(x-1) \sum_{i=1}^{m} s_i x^{i-1}}$$

and a matching lower bound. The lower bounds are valid for
deterministic or randomized preemptive algorithms.
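The ratio is easy to evaluate for concrete speeds. The sketch below is an illustration; the function names are assumptions made here, and exact rational arithmetic is used only to keep the comparison with known special cases free of rounding. Speeds are expected in non-decreasing order with the last one equal to 1.

```python
from fractions import Fraction

def speed_parameter(speeds):
    """x = (sum of speeds) / (sum of speeds - 1)."""
    total = sum(speeds)
    return total / (total - 1)

def competitive_ratio(speeds):
    """x^m / ((x - 1) * sum_i s_i * x^(i-1)) for speeds s_1 <= ... <= s_m = 1."""
    m = len(speeds)
    x = speed_parameter(speeds)
    return x ** m / ((x - 1) * sum(s * x ** i for i, s in enumerate(speeds)))

identical3 = competitive_ratio([Fraction(1)] * 3)            # m^m / (m^m - (m-1)^m) for m = 3
two_related = competitive_ratio([Fraction(1, 2), Fraction(1)])
```

For three identical machines this gives 27/19, and for two machines with speeds 1/2 and 1 it gives 9/7, which agrees with the two-machine formula 1 + s/(s^2 + s + 1).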

Note that non-decreasing speed ratios for related machines were already con- 
sidered by Vestjens [10]. He studied a different preemptive on-line scheduling 
model where jobs arrive over time, instead of one by one. He showed that for 
this model, an algorithm with competitive ratio 1, which used a finite number 
of preemptions can be given if and only if speed ratios are non-decreasing. 

Our results generalize a few previous results. Chen, van Vliet and Woeginger
gave a preemptive optimal algorithm for identical machines [3]. Some ideas of
our results are based on that paper. They show that the best competitive ratio
for identical machines is $m^m / (m^m - (m-1)^m)$.

A lower bound of the same value on the competitive ratio of non-preemptive
randomized algorithms is also known. The proofs use sequences similar to the
ones in [3] and were given independently by [2] and [8]. However, no optimal
non-preemptive randomized algorithm is known for m ≥ 3. (For m = 2, such an
optimal algorithm is given in [1].)

Preemptive scheduling on two related machines was studied independently
by [4] and by [11]. Both papers show that the optimal competitive ratio is
$1 + s_1/(s_1^2 + s_1 + 1)$.

Preemptive scheduling on related machines was also considered by Epstein
and Sgall [5]. The paper gives a constant competitive algorithm for any m and
set of speeds. That paper also gives a lower bound of 2 on the competitive ratio
of any algorithm with an unbounded number of machines and specific lower
bounds for constant values of m. Those lower bounds are valid for randomized
preemptive or non-preemptive algorithms. Our lower bounds are the general
case of the lower bound in [5] for unbounded m. Even though our result does
not hold for non-preemptive algorithms, in Section 3 we mention some cases
where it holds, and one of these cases is an exponential set of speeds ($s_i = y^{m-i}$
for some 0 < y < 1) which is also used in [5]. The tight competitive ratio in this
case is $\frac{x^m (x-y)}{(x-1)(x^m - y^m)}$.

We start the paper with definitions and proofs of the optimal algorithms and 
prove the lower bounds in Section 3. 



2 Algorithms 

We describe the preemptive algorithm. Note that it is easy to compute the
optimal preemptive off-line load at every step. The formula was given by [7] and
by [6]. The optimal load is the maximum of the following m values: the total
weight of all jobs divided by the sum of speeds, and for 1 ≤ j ≤ m - 1, the
sum of weights of the largest j jobs divided by the sum of the largest j speeds of
machines.
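The formula for the optimal off-line load can be written down directly. This is a minimal sketch; the function name `optimal_preemptive_load` is an assumption introduced here.

```python
from fractions import Fraction

def optimal_preemptive_load(weights, speeds):
    """Maximum of total weight / total speed and, for j = 1..m-1,
    (sum of the j largest weights) / (sum of the j largest speeds)."""
    jobs = sorted(weights, reverse=True)
    fast = sorted(speeds, reverse=True)
    best = Fraction(sum(jobs)) / sum(fast)
    for j in range(1, len(fast)):
        best = max(best, Fraction(sum(jobs[:j])) / sum(fast[:j]))
    return best
```

For weights 3 and 1 on machines of speeds 1/2 and 1, the total-weight term is 8/3 but the largest job alone needs load 3 on the fastest machine, so the optimal load is 3.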

Our algorithm, similarly to [3], tries to maintain a ratio of x between loads
of subsequent machines.

We use the following notation: the load of machine i after the arrival of t
jobs is denoted by $L_i^t$. The optimal load at that time is denoted by $C_{opt}^t$ and the
sum of weights of the first t jobs is denoted by $W^t$. Let

$$r = \frac{x^m}{(x-1) \sum_{i=1}^{m} s_i x^{i-1}} .$$

The algorithm maintains the following three invariants.

- At any time t, $L_1^t \le L_2^t \le \cdots \le L_m^t$.

- At any time t, $L_m^t \le r \cdot C_{opt}^t$.

- At any time t, for every $1 \le k \le m$, $\sum_{i=1}^{k} s_i L_i^t \le \frac{\sum_{i=1}^{k} s_i x^{i-1}}{\sum_{i=1}^{m} s_i x^{i-1}} W^t$.

A new job $J_{t+1}$ (which arrives at time t + 1) is assigned as follows. The
new optimal off-line load $C_{opt}^{t+1}$ is computed using its weight $w(J_{t+1})$. Then the following
intervals are reserved. On machine m, the interval

$$I_m = [L_m^t, \; r \cdot C_{opt}^{t+1}],$$

and on machine j ($1 \le j \le m-1$), the interval $I_j = [L_j^t, L_{j+1}^t]$. Those intervals
are disjoint. The intervals relate to load and not to weight; the weight that can
be assigned on $I_j$ ($1 \le j \le m-1$) is $s_j (L_{j+1}^t - L_j^t)$.

To assign $J_{t+1}$, go from $I_m$ to $I_1$, putting a part of the job, as large as possible,
in each interval, until all of the job is assigned. After the assignment there will be
some fully occupied intervals $I_{z+1}, \ldots, I_m$, some empty intervals $I_1, \ldots, I_{z-1}$ and
a partially or fully occupied interval $I_z$.
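One assignment step can be sketched as follows. This is an illustration under stated assumptions: the function name `assign_job` is invented here, the parameter `target` stands for the upper end of the interval reserved on the fastest machine (r times the new optimal off-line load) and must be supplied by the caller, and exact rational inputs are assumed so that the comparison with zero is reliable.

```python
from fractions import Fraction

def assign_job(loads, speeds, weight, target):
    """Place one job into the reserved intervals, going from machine m down to machine 1.

    loads[j] is the current load of machine j+1 (non-decreasing), speeds[j] its speed,
    and target is the upper end of the interval on the fastest machine.
    Returns the list of new loads.
    """
    m = len(loads)
    tops = list(loads[1:]) + [target]        # upper ends: next machine's old load, or target
    new_loads = list(loads)
    remaining = weight
    for j in reversed(range(m)):
        capacity = speeds[j] * (tops[j] - loads[j])   # weight that fits into interval I_j
        part = min(remaining, capacity)
        new_loads[j] = loads[j] + part / speeds[j]
        remaining -= part
        if remaining == 0:
            break
    if remaining > 0:
        raise ValueError("reserved intervals cannot hold the job")
    return new_loads

# Two identical machines: x = 2 and r = 4/3; both jobs have weight 1 and optimal load 1.
speeds = [Fraction(1), Fraction(1)]
after_first = assign_job([Fraction(0), Fraction(0)], speeds, Fraction(1), Fraction(4, 3))
after_second = assign_job(after_first, speeds, Fraction(1), Fraction(4, 3))
```

After the second job the loads are 2/3 and 4/3, so their ratio is exactly x = 2 and the time slots used by the two parts of the job do not overlap.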

Next, we show that it is always possible to partition a job among those
intervals.

For convenience define $s_0 = 0$ and $s_{m+1} = 1$. Then $s_{i-1}/s_i \le s_i/s_{i+1}$ holds
for all $1 \le i \le m$.

Lemma 1. If the invariants are fulfilled at step t, then the reserved intervals
are sufficient to assign $J_{t+1}$.

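Proof. The total weight that can be assigned to all intervals is

$$A = (r C_{opt}^{t+1} - L_m^t) + \sum_{j=1}^{m-1} (L_{j+1}^t - L_j^t) s_j$$

$$= r C_{opt}^{t+1} + \sum_{j=2}^{m} s_{j-1} L_j^t - \sum_{j=1}^{m} s_j L_j^t$$

$$= r C_{opt}^{t+1} - W^t + s_{m-1} W^t - \sum_{j=1}^{m-1} \left( \frac{s_j}{s_{j+1}} - \frac{s_{j-1}}{s_j} \right) \sum_{i=1}^{j} s_i L_i^t ,$$

where we used $s_m = 1$ and $\sum_{j=1}^{m} s_j L_j^t = W^t$.

Since $s_{j-1}/s_j \le s_j/s_{j+1}$ we can use the third invariant for each value of j and
get that the above is at least

$$A \ge r C_{opt}^{t+1} - W^t + s_{m-1} W^t - \sum_{j=1}^{m-1} \left( \frac{s_j}{s_{j+1}} - \frac{s_{j-1}}{s_j} \right) \frac{\sum_{i=1}^{j} s_i x^{i-1}}{\sum_{i=1}^{m} s_i x^{i-1}} W^t = r C_{opt}^{t+1} + \left( (x-1) - \frac{x^m}{\sum_{i=1}^{m} s_i x^{i-1}} \right) W^t .$$

We consider two cases:

1. $w(J_{t+1}) \ge \frac{W^t}{\sum_{i=1}^{m} s_i - 1}$, and then $C_{opt}^{t+1} \ge w(J_{t+1})$.

2. $w(J_{t+1}) < \frac{W^t}{\sum_{i=1}^{m} s_i - 1}$, and then $C_{opt}^{t+1} \ge W^{t+1} \Big/ \sum_{i=1}^{m} s_i$.

We show that the assignment is successful in both cases.

Case 1:
Since the term multiplied by $W^t$ is non-positive we can substitute $W^t \le w(J_{t+1}) \left( \sum_{i=1}^{m} s_i - 1 \right)$ and get that

$$A \ge \frac{x^m}{(x-1) \sum_{i=1}^{m} s_i x^{i-1}} \, w(J_{t+1}) + w(J_{t+1}) \left( \sum_{i=1}^{m} s_i - 1 \right) \left( (x-1) - \frac{x^m}{\sum_{i=1}^{m} s_i x^{i-1}} \right) ;$$

simplifying this (using $\sum_{i=1}^{m} s_i - 1 = \frac{1}{x-1}$) gives

$$A \ge w(J_{t+1}) .$$

Case 2:
In this case, $C_{opt}^{t+1} \ge \frac{w(J_{t+1}) + W^t}{\sum_{i=1}^{m} s_i}$. Substituting this we get that the term multiplied
by $W^t$ is

$$\frac{x^m}{(x-1) \sum_{i=1}^{m} s_i \sum_{i=1}^{m} s_i x^{i-1}} + (x-1) - \frac{x^m}{\sum_{i=1}^{m} s_i x^{i-1}} ,$$

which is non-negative. By using $W^t > w(J_{t+1}) \left( \sum_{i=1}^{m} s_i - 1 \right)$ we get

$$A \ge w(J_{t+1}) \left[ \frac{x^m}{(x-1) \sum_{i=1}^{m} s_i \sum_{i=1}^{m} s_i x^{i-1}} + \left( \sum_{i=1}^{m} s_i - 1 \right) \left( \frac{x^m}{(x-1) \sum_{i=1}^{m} s_i \sum_{i=1}^{m} s_i x^{i-1}} + (x-1) - \frac{x^m}{\sum_{i=1}^{m} s_i x^{i-1}} \right) \right] ;$$

simplifying this also gives $A \ge w(J_{t+1})$.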

To complete the proof of the algorithm, we need to show that all invariants 
are kept after an assignment of a job. This is clear for the first two invariants, 
from the definition of the algorithm. 

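Lemma 2. If the invariants are fulfilled after step t, then they are also kept after
step t + 1.

This would be sufficient since at the start, all loads are zero.

Proof. We only need to show that the third invariant holds for every $1 \le k \le m$.

According to the definition of the algorithm, there exists a machine z such
that for i < z, $L_i^{t+1} = L_i^t$, for $z < i \le m$, $L_i^{t+1} = L_{i+1}^t$, and $L_z^t \le L_z^{t+1} \le L_{z+1}^t$
(for convenience $L_{m+1}^t = r C_{opt}^{t+1}$).

If k < z, then

$$\sum_{i=1}^{k} s_i L_i^{t+1} = \sum_{i=1}^{k} s_i L_i^t \le \frac{\sum_{i=1}^{k} s_i x^{i-1}}{\sum_{i=1}^{m} s_i x^{i-1}} W^t \le \frac{\sum_{i=1}^{k} s_i x^{i-1}}{\sum_{i=1}^{m} s_i x^{i-1}} W^{t+1} .$$

If $k \ge z$ then we need to show

$$\sum_{i=k+1}^{m} s_i L_i^{t+1} \ge \frac{\sum_{i=k+1}^{m} s_i x^{i-1}}{\sum_{i=1}^{m} s_i x^{i-1}} W^{t+1} . \qquad (1)$$

Since $k \ge z$, $L_i^{t+1} = L_{i+1}^t$ for $i > k$ and the left hand side is equal to

$$\sum_{i=k+1}^{m} s_i L_{i+1}^t$$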
since ^ 0 for oan use the invariants of step t, and get that

this value is at least 



i=l 



i-1 



't+1 



: - 1 



^ S^X^ W* 



\i=k+l 



Let w(Jt+i) = then Cl'l^ = max < //, and = (1 — 



E 

i = l 



Simple calculations show that inequality (1) holds. 



3 Matching Lower Bounds 

To prove a matching lower bound, we use the following lemma, given in [5].

Lemma 3. Consider a sequence of at least m jobs, where $J_0, J_1, \ldots, J_{m-1}$ are
the last m jobs. Let $C_{opt}(J_i)$ be the preemptive optimal off-line cost after the
arrival of $J_i$.

The competitive ratio of any preemptive randomized on-line algorithm is at
least

$$W \Big/ \left( \sum_{i=1}^{m} s_i C_{opt}(J_{i-1}) \right) ,$$

where W is the total weight of all jobs in the sequence.

Note that if we consider non-preemptive optimal off-line costs, the same
expression lower bounds the competitive ratio of non-preemptive randomized
algorithms.

Next, we construct the lower bound sequence. The construction is somewhat
similar to the proofs in [2,3,8].

The sequence starts with an amount of $\sum_{i=1}^{m} s_i - 1$ of very small jobs (sand).
These jobs are followed by m - 1 jobs $J_1, \ldots, J_{m-1}$ where $W(J_i) = x^{i-1}$.

Theorem 1. The competitive ratio of any preemptive randomized on-line
algorithm is at least

$$\frac{x^m}{(x-1) \sum_{i=1}^{m} s_i x^{i-1}} .$$

The proof of the theorem follows from Lemma 3 and the following lemma.

Lemma 4. The preemptive optimal off-line cost after the arrival of i big jobs is
$x^{i-1}$; this is true for $i = 0, \ldots, m-1$.
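Proof. We start by proving the following claim.

Claim. For each $1 \le k \le m-1$,

$$\sum_{i=1}^{k} s_{m-k+i} \ge \sum_{i=0}^{k-1} \frac{1}{x^i} .$$

Let j be a maximum index such that $s_j < 1/x^{m-j}$. If no such index exists then
j = 0. Note that j < m since $s_m = 1$. Hence for k > j, $s_k \ge 1/x^{m-k}$. We consider
two cases.

Case 1: $j < m - k + 1$.
Then $\sum_{i=1}^{k} s_{m-k+i} \ge \sum_{i=1}^{k} \frac{1}{x^{k-i}} = \sum_{i=0}^{k-1} \frac{1}{x^i}$.

Case 2: $j \ge m - k + 1$.
Since $s_{p-1}/s_p \le s_j/s_{j+1} < 1/x$ for all $p \le j$, by induction $s_p < 1/x^{m-p}$ for all $p \le j$.
Hence

$$\sum_{i=1}^{k} s_{m-k+i} = \sum_{i=1}^{m} s_i - \sum_{i=1}^{m-k} s_i \ge \frac{x}{x-1} - \sum_{i=1}^{m-k} \frac{1}{x^{m-i}} \ge \frac{x}{x-1} - \frac{1}{x^{k-1}} \cdot \frac{1}{x-1} = \sum_{i=0}^{k-1} \frac{1}{x^i} .$$

The lemma is satisfied for i = 0, since splitting the sand evenly gives

$$C_{opt} = \frac{\sum_{i=1}^{m} s_i - 1}{\sum_{i=1}^{m} s_i} = \frac{1}{x} .$$

For i > 0 let $W^i$ be the total weight of jobs arriving no later than $J_i$.
$C_{opt}$ after the arrival of $J_i$ is the maximum between

$$W^i \Big/ \sum_{j=1}^{m} s_j \quad \text{and} \quad \max_{1 \le k \le i} \left( \sum_{p=k}^{i} W(J_p) \right) \Big/ \left( \sum_{p=m-i+k}^{m} s_p \right) .$$

The first value is $x^{i-1}$. According to the claim, each term in the second value is at most $x^{i-1}$. This
proves the lemma.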
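The lower-bound sequence can be evaluated exactly for concrete speeds. The sketch below is an illustration under stated assumptions: the function name is invented here, the sand is treated as infinitesimally small jobs (so it only contributes to the total-weight term of the optimal load), and the speeds are expected in non-decreasing order with the last one equal to 1.

```python
from fractions import Fraction

def lower_bound_from_sequence(speeds):
    """Evaluate W / sum_i s_i * C_opt(J_{i-1}) on the sand-plus-big-jobs sequence."""
    m = len(speeds)
    total_speed = sum(speeds)
    x = total_speed / (total_speed - 1)
    fastest = sorted(speeds, reverse=True)
    weight = total_speed - 1                  # amount of sand
    opts = [weight / total_speed]             # C_opt(J_0): sand spread over all machines
    big = []
    for i in range(1, m):
        big.append(x ** (i - 1))              # W(J_i) = x^(i-1)
        weight += big[-1]
        largest = sorted(big, reverse=True)
        opt = weight / total_speed
        for k in range(1, len(largest) + 1):
            opt = max(opt, sum(largest[:k]) / sum(fastest[:k]))
        opts.append(opt)
    return weight / sum(s * c for s, c in zip(speeds, opts))
```

For three identical machines the sequence consists of sand of total weight 2 followed by jobs of weight 1 and 3/2, and the resulting bound is 27/19.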



Note that if only case 1 of the claim occurs, i.e. for all $i \ge 2$, $s_i \ge 1/x^{m-i}$, then
the optimal off-line does not use preemptions, and then the lower bound is valid
for randomized non-preemptive on-line algorithms.
This is the case for two basic sets of machines.

1. All machines but the slowest have speed 1. The lower bound value in this case
is $x^m / (x^m - 1 + (x-1)(s_1 - 1))$. This gives the result for m identical machines
for $s_1 = 1$ and for m - 1 identical machines for $s_1 = 0$ (see [3]).

2. Machine speeds are powers of some number 0 < y < 1, i.e. $s_i = y^{m-i}$. In this
case $x = \frac{1 - y^m}{y - y^m}$, hence $y \ge \frac{1}{x}$. The lower bound is $x^m (x-y) / ((x-1)(x^m - y^m))$,
which tends to (y + 1) as m goes to infinity. This lower bound is given in [5].

4 Conclusions and Open Problems 

We have given optimal algorithms for a class of uniformly related machines. 
It would be interesting to give optimal algorithms for other classes. It is also 
unknown what the best competitive ratio for general related machines is. There 
is a large gap between the lower bound of 2 given in [5] , and the algorithm given 
there which has competitive ratio above 20. 
Abstract. The UPS Problem consists of the following: given a vertex
set V, vertex probabilities $(p_v)_{v \in V}$ and distances $l : V^2 \to \mathbb{R}^+$ that
satisfy the triangle inequality, find a Hamilton cycle such that the expected
length of the shortcut that skips each vertex v with probability $1 - p_v$
(independently of the others) is minimum. This problem appears in the
following context. Drivers of delivery companies visit customers daily to 
deliver packages. For the company, the shorter the distance traversed, 
the better. For a driver, routes that change dramatically from one day to 
the other are inconvenient; it is better if one only has to shortcut a fixed 
route. The UPS problem, whose objective captures these two points of 
view, is at least as hard to approximate as the Metric TSP. Given that one 
of the vertices has probability one, we show that the performance ratio 
of a TSP tour for the UPS problem is $1/p_{\min}$, where $p_{\min} := \min_{v \in V} p_v$.
We also show that this is tight. Consequently, Christofides’ algorithm 
for the TSP has a performance ratio of 3/(2pmin) for the UPS problem 
and the approximation threshold for the UPS problem is at most 1/pmin 
times the one for the TSP. 



1 Introduction 

1.1 Motivation 

Package delivery companies, like the United Parcel Service (UPS), have to
deliver packages daily to several of their customers. The order of delivery is chosen
so as to minimize the distance traversed by the drivers. Each delivery concerns
only a subset of the customers. Therefore each delivery could be optimized
individually. It is, however, easier for a driver to shortcut a fixed route than to travel
a completely different route each time. In this paper we study a variation of the
Traveling Salesman Problem (TSP) which captures the issue described above.

* Research partially done while at Humboldt-Universität zu Berlin, supported
in part by CAPES/DAAD Proc. 089/99, CNPq Proc. 301174/97-0, FAPESP
Proc. 96/04505-2 and ProNEx 107/97 - MCT/FINEP (Brazil).

** Research supported in part by Deutsche Forschungsgemeinschaft, Pr296/6-1.
The delivery company has the information on how often each customer
receives a package. From this information one can estimate the probability that
a customer receives a package on a given day. Roughly speaking, the problem we call the UPS
Problem consists of the following: find an ordering of all customers that
minimizes the expected length of the route that starts and ends at a company
location and visits, in this ordering, a randomly chosen (according to the customers'
estimated probabilities) subset of the customers. The problem, as well as the
described application, was proposed by [8]. Also according to [8], even the special
case where customers are divided into two clusters, the customers who receive
packages often and the ones who receive packages not so often, is of interest.
The setup is conceivable in other delivery systems as well. We therefore expect
the study of this problem to have several applications.

1.2 Notation and Problem Statement 

Let G = (V, E) be the complete graph on n vertices. A path is a sequence
$(v_0, v_1, \ldots, v_k)$ of distinct vertices of G. A cycle is a sequence $(v_0, v_1, \ldots, v_k)$ of
vertices of G, where $v_0, \ldots, v_{k-1}$ are distinct and $v_k = v_0$. For a path (cycle)
$P = (v_0, v_1, \ldots, v_k)$, we denote by V(P) the set $\{v_0, \ldots, v_k\}$ and we say that
P is a path (cycle) on V(P). A tour (or a Hamilton cycle) is a cycle on V.

We denote the set $\{(v_0, v_1), (v_1, v_2), \ldots, (v_{k-1}, v_k)\}$ by E(P). The length of
P with respect to a function $l : V^2 \to \mathbb{R}^+$ is denoted by l(P), and is given by

$$l(P) := \sum_{e \in E(P)} l(e) .$$

The Traveling Salesman Problem (TSP) is the following: given a complete
graph G = (V, E) and a function $l : V^2 \to \mathbb{R}^+$, find a tour of minimum length.
We refer to such a tour as a TSP tour.

Unless specified otherwise, we consider in the following only functions
$l : V^2 \to \mathbb{R}^+$ that satisfy the triangle inequality: for any x, y, z in V,
$l(x,y) \le l(x,z) + l(z,y)$. Under this condition, TSP is called Metric TSP.

Note that I may be given partially. Then the length of an edge is considered to 
be the infimum of the lengths of all paths between its end vertices. This closure 
satisfies the triangle inequality. 

Given a path $P$ and a subset $S$ of $V(P)$, the shortcut of $P$ induced by $S$,
denoted by $sc_S(P)$, is the path on $S$ given by the subsequence of $P$ containing
exactly the vertices of $S$. Similarly we can define the shortcut of a cycle $C$ induced
by a subset $S$ of $V(C)$ and denote it by $sc_S(C)$.

Assume each vertex $v$ in $V$ has an associated probability $p_v$ and let
$p := (p_v)_{v \in V}$. These probabilities induce a probability distribution on the subsets of
$V$: for each $S \subseteq V$,

$$\Pr_p[S] := \prod_{v \in S} p_v \prod_{v \notin S} (1 - p_v).$$

Let $p$ and $q$ be two sets of vertex probabilities. We say that $p$ dominates $q$ if
$p_v \ge q_v$ for all $v \in V$.
Given a cycle $C$, denote by $\mathit{lsc}_p(C)$ the expected value of the length of the
shortcut of $C$ induced by a vertex subset which is randomly chosen according to $p$:

$$\mathit{lsc}_p(C) := \sum_{S \subseteq V} \Pr_p[S] \, l(sc_S(C)).$$
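The expectation just defined can be evaluated directly on very small instances by enumerating all vertex subsets. The following is a minimal sketch, not part of the paper; the names `cycle_length` and `expected_shortcut_length` are illustrative, and the convention that a shortcut on fewer than two vertices has length 0 is an assumption of the sketch.

```python
from itertools import product

def cycle_length(order, dist):
    """Length of the closed cycle visiting `order` in sequence.

    A shortcut on fewer than two vertices is taken to have length 0
    (an assumption of this sketch).
    """
    m = len(order)
    if m < 2:
        return 0.0
    return sum(dist[order[i]][order[(i + 1) % m]] for i in range(m))

def expected_shortcut_length(tour, dist, prob):
    """Brute-force lsc_p(C): sum over all subsets S of Pr[S] * l(sc_S(C))."""
    total = 0.0
    for mask in product((False, True), repeat=len(tour)):
        # Probability that exactly the vertices flagged in `mask` are drawn.
        pr = 1.0
        for v, chosen in zip(tour, mask):
            pr *= prob[v] if chosen else 1.0 - prob[v]
        if pr == 0.0:
            continue
        # The shortcut keeps the chosen vertices in tour order.
        shortcut = [v for v, chosen in zip(tour, mask) if chosen]
        total += pr * cycle_length(shortcut, dist)
    return total

# Three vertices at pairwise distance 1; vertex 0 is always present.
unit = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
value = expected_shortcut_length([0, 1, 2], unit, [1.0, 0.5, 0.5])
```

The enumeration takes time exponential in the number of vertices, so it is only useful for checking small examples by hand.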

Now we are ready to state the UPS problem: 

Definition 1. Given the complete graph $G = (V, E)$, a function $l : V^2 \to \mathbb{R}^+$
satisfying the triangle inequality, and probabilities $p = (p_v)_{v \in V}$, the UPS problem
asks for a tour $C$ that minimizes $\mathit{lsc}_p(C)$.

The performance ratio of a tour $C$ for the UPS problem is the ratio $\mathit{lsc}_p(C) / \mathrm{opt}$,
where opt denotes the optimal value of the UPS problem, that is,
$\mathrm{opt} = \mathit{lsc}_p(C^{UPS})$ for some optimal tour $C^{UPS}$ of the UPS problem.

Throughout the paper, we consider the UPS problem under the additional
assumption that $p_u = 1$ for at least one vertex $u$ (representing, say, a UPS
location).



1.3 Results 

We start by studying a restricted class of vertex probabilities for the UPS problem.
Let $G$, $l$, and $p^*$ be the input of the UPS problem, where we assume that, for
some $0 < p \le 1$, $p^*_v \in \{p, 1\}$ for all $v$. Our first result is a lower bound on the
objective function for this particular case in terms of the TSP optimum.

Theorem 1. Let $C$ be a tour and $C^{TSP}$ be a TSP tour. Then
$\mathit{lsc}_{p^*}(C) \ge p \cdot l(C^{TSP})$.



The assumption that the vertex probabilities only attain the values p and 1 can 
be removed with the help of the following proposition. 

Proposition 1. Let $p$ and $q$ be two sets of vertex probabilities, where $p$ dominates $q$.
Then $\mathit{lsc}_q(C) \le \mathit{lsc}_p(C)$ for any tour $C$.

The performance ratio of a TSP tour $C^{TSP}$ as a solution for the general UPS problem follows
from Theorem 1, using Proposition 1:

Corollary 1. Let $C^{TSP}$ be a TSP tour, let $C^{UPS}$ be an optimal solution of the UPS
problem, and let $p_{\min} := \min_{v \in V} p_v$. Denote by opt the optimal value $\mathit{lsc}_p(C^{UPS})$.
Then

$$\frac{\mathit{lsc}_p(C^{TSP})}{\mathrm{opt}} \le \frac{1}{p_{\min}}.$$
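Corollary 1 can be sanity-checked numerically on a tiny random Euclidean instance by brute force over all tours and all subsets. The sketch below is illustrative only: the function names, the instance size, the seed, and the probability range are assumptions, and a passing check on one instance is of course no substitute for the proof.

```python
import itertools
import math
import random

def cycle_length(order, dist):
    """Length of the closed cycle through `order` (0 for fewer than 2 vertices)."""
    m = len(order)
    if m < 2:
        return 0.0
    return sum(dist[order[i]][order[(i + 1) % m]] for i in range(m))

def lsc(tour, dist, prob):
    """Expected shortcut length of `tour` under independent vertex probabilities."""
    total = 0.0
    for mask in itertools.product((0, 1), repeat=len(tour)):
        pr = math.prod(prob[v] if bit else 1.0 - prob[v]
                       for v, bit in zip(tour, mask))
        if pr > 0.0:
            kept = [v for v, bit in zip(tour, mask) if bit]
            total += pr * cycle_length(kept, dist)
    return total

def corollary_ratio(n=6, seed=1):
    """Return (lsc_p(C_TSP) / opt, 1 / p_min) for a random instance."""
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    dist = [[math.dist(a, b) for b in pts] for a in pts]
    # Vertex 0 plays the role of the depot with probability 1.
    prob = [1.0] + [rng.uniform(0.2, 0.9) for _ in range(n - 1)]
    # Fix vertex 0 as the start to enumerate each cyclic order once per direction.
    tours = [(0,) + rest for rest in itertools.permutations(range(1, n))]
    tsp_tour = min(tours, key=lambda t: cycle_length(t, dist))
    opt = min(lsc(t, dist, prob) for t in tours)
    return lsc(tsp_tour, dist, prob) / opt, 1.0 / min(prob)

ratio, bound = corollary_ratio()
```

On such small instances the observed ratio is typically much closer to 1 than to the bound; the tight examples require a specially constructed metric.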

Our second result is that this bound is tight. 

Theorem 2. For every $\varepsilon > 0$ there is an instance $\mathcal{I}$ of the UPS problem such
that there are two TSP tours $C_1$ and $C_2$ with $\mathit{lsc}_p(C_2) \le (p_{\min} + \varepsilon)\,\mathit{lsc}_p(C_1)$.
The tightness then follows from $\mathrm{opt} \le \mathit{lsc}_p(C_2)$.

When studying approximations for a computational problem, it is certainly 
necessary to explore the complexity theoretical limitations of that approach. We 
prove the following hardness of approximation result. 

Theorem 3. The approximation threshold of the UPS problem with any constantly
bounded probability set is not less than the approximation threshold of
the Metric TSP.

The proof of Theorem 1 will be given in Section 2. In Section 3 we will sketch 
the proof of Proposition 1 and give more details on Corollary 1. Theorems 2 and 
3 will be proved in Sections 4 and 5, respectively. 



1.4 Conclusions and Open Problems 

The UPS problem extends the TSP in that not only the length of the tour, 
but also the lengths of the subtours determine its objective value. As one might 
expect, the tradeoff depends on the vertex probabilities. We give matching upper 
and lower bounds on the rate in Theorems 1 and 2. 

This result, being interesting in its own right, has several consequences for
the approximation properties of the UPS problem. The currently best known
approximation algorithm for the Metric TSP, Christofides' algorithm [3], has a
performance ratio of $3/2$. As a consequence of Corollary 1, the same algorithm has
a performance ratio of $\frac{3}{2 p_{\min}}$ for the UPS problem.

Similarly, every other approximation algorithm for the Metric TSP can be
applied to the UPS problem, while the performance ratio is multiplied by a factor
of $1/p_{\min}$. Thus, the approximation threshold of the UPS problem is at most $\theta / p_{\min}$,
where $\theta$ is the approximation threshold for the Metric TSP (for the definition
of the approximation threshold and related notions see, e.g., Chapter 13 in [5]).
This fact is complemented by Theorem 3, which states that it is at least $\theta$.

One of the first questions that one might ask in this context concerns the
influence of different probabilities. The factor of $1/p_{\min}$ might seem too pessimistic
if there were only few vertices with probability $p_{\min}$ and lots of vertices with
much larger probabilities. However, the tight examples given in the proof of
Theorem 2 can be modified so as to show that the bound given in Theorem 1 is
very accurate.

The situation is less clear in the case of approximation algorithms. Here it
is conceivable that an algorithm takes into account the distances given by $l$ and
combines them with the individual vertex probabilities in a clever way. The
non-approximability result in Theorem 3 does not set any limit for that; however, it
is no less conceivable that the hardness result could be improved to show that
the approximation threshold is actually $\theta / p_{\min}$.

The same consideration applies to the important special case of Euclidean
instances. An instance of the TSP is called Euclidean if there is a point in
the plane for every vertex such that the distance given by $l$ is the Euclidean
distance between the points. For this special case, there exist polynomial-time
approximation schemes (PTAS) [2,4] for TSP. (A PTAS consists of a polynomial-time
algorithm for the problem with a performance ratio of at most $1 + \varepsilon$, for
each $\varepsilon > 0$.) Thus, for each $\varepsilon > 0$, by Theorem 1 there exists a polynomial time
algorithm for the UPS problem with Euclidean instances with a performance
ratio of at most $(1 + \varepsilon)/p_{\min}$. On the other hand, it is conceivable both that there is a
PTAS and that the approximation threshold is up to $1/p_{\min}$.



2 The UPS Problem with Probabilities 1 or p 



The aim of this section is to prove Theorem 1. Recall that there is a $u \in V$
with $p^*_u = 1$ and that, for that theorem, the vertex probabilities are restricted
to the values $p$ and 1.

Let $C^{TSP}$ be a TSP tour and $C$ be a tour. To bound $\mathit{lsc}_{p^*}(C)$ in terms of
$l(C^{TSP})$ we first need another formulation for the corresponding expectation. To
this end we introduce some more notation.

Let $U$ be the set of vertices of probability 1, $\bar{U} := V \setminus U$, and let $t$ be the
number of vertices in $\bar{U}$. If $t = 0$ then $U = V$ and $\mathit{lsc}_{p^*}(C) = l(C)$. Since
$l(C) \ge l(C^{TSP}) \ge p \cdot l(C^{TSP})$, the theorem clearly holds in this case. So we may assume $t \ge 1$.

For every $u, v \in \bar{U}$, let $P_{uv}$ be the subsequence of $C$ beginning at $u$ and
ending at $v$ (circularly). Let $\tilde{P}_{uv}$ be the shortcut of $P_{uv}$ induced by $\{u, v\} \cup U$.
Note that $\tilde{P}_{vv}$ denotes a cycle, namely the shortcut of $C$ induced by $\{v\} \cup U$.

Denote by $v_0, v_1, \ldots, v_{t-1}$ the vertices of $\bar{U}$ in the order given by $C$. For
$i = 0, \ldots, t-1$, set $\mathcal{C}_i := \{\tilde{P}_{v_j v_{j+i+1}} : 0 \le j < t\}$, where indices are taken modulo
$t$, and $l(\mathcal{C}_i) := \sum_{P \in \mathcal{C}_i} l(P)$. Each $\mathcal{C}_i$ is a collection of paths (cycles if $i = t-1$)
in $G$ whose concatenation results in an Eulerian subgraph of $G$. Because $U \ne \emptyset$
and each vertex in $\bar{U}$ appears in some path (cycle if $i = t-1$) in $\mathcal{C}_i$, each of these
Eulerian subgraphs is connected and spanning. Therefore, for each $i$,

$$l(\mathcal{C}_i) \ge l(C^{TSP}). \tag{1}$$

Using the notation above we can give the following characterization of $\mathit{lsc}_{p^*}(C)$:

Lemma 1. $\mathit{lsc}_{p^*}(C) = (1-p)^t \, l(sc_U(C)) + p(1-p)^{t-1} \, l(\mathcal{C}_{t-1}) + \sum_{i=0}^{t-2} p^2 (1-p)^i \, l(\mathcal{C}_i)$.

Proof. By definition, $\mathit{lsc}_{p^*}(C) = \sum_{U \subseteq S \subseteq V} \Pr[S] \, l(sc_S(C))$. Note that the summands
where $|S \setminus U| \le 1$ contribute $(1-p)^t \, l(sc_U(C)) + p(1-p)^{t-1} \, l(\mathcal{C}_{t-1})$
to $\mathit{lsc}_{p^*}(C)$.

Let $\mathcal{S} := \{S : U \subseteq S \subseteq V, |S \setminus U| \ge 2\}$ be the collection of the other vertex
subsets. For any $S \in \mathcal{S}$, let $E_S := E(sc_{S \setminus U}(C))$. Then, adding indices modulo $t$,

$$
\begin{aligned}
\sum_{S \in \mathcal{S}} \Pr[S] \, l(sc_S(C))
&= \sum_{S \in \mathcal{S}} \Pr[S] \sum_{(v_i, v_j) \in E_S} l(\tilde{P}_{v_i v_j}) \\
&= \sum_{i=0}^{t-2} \sum_{j=0}^{t-1} l(\tilde{P}_{v_j v_{j+i+1}}) \sum_{S \in \mathcal{S} :\, (v_j, v_{j+i+1}) \in E_S} \Pr[S] \\
&= \sum_{i=0}^{t-2} \sum_{j=0}^{t-1} l(\tilde{P}_{v_j v_{j+i+1}}) \, \Pr[(v_j, v_{j+i+1}) \in E_S] \\
&= \sum_{i=0}^{t-2} l(\mathcal{C}_i) \, p^2 (1-p)^i,
\end{aligned}
$$

and the lemma holds. □

Putting (1) and Lemma 1 together, we have that

$$
\begin{aligned}
\mathit{lsc}_{p^*}(C) &\ge p(1-p)^{t-1} \, l(C^{TSP}) + \sum_{i=0}^{t-2} p^2 (1-p)^i \, l(C^{TSP}) \\
&= l(C^{TSP}) \left( p(1-p)^{t-1} + \sum_{i=0}^{t-2} p^2 (1-p)^i \right).
\end{aligned}
$$

Straightforward calculation shows that the right hand side is equal to
$l(C^{TSP}) \, p$, concluding the proof of Theorem 1. □
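For completeness, the calculation closing the proof of Theorem 1 can be written out as follows. It only uses the finite geometric sum; the coefficient of $l(C^{TSP})$ is $p(1-p)^{t-1}$ plus the sum of $p^2(1-p)^i$ over $i = 0, \ldots, t-2$, with $t \ge 1$ and $0 < p \le 1$ as in the text.

```latex
\begin{aligned}
p(1-p)^{t-1} + \sum_{i=0}^{t-2} p^2 (1-p)^i
  &= p(1-p)^{t-1} + p^2 \cdot \frac{1 - (1-p)^{t-1}}{1 - (1-p)} \\
  &= p(1-p)^{t-1} + p \bigl( 1 - (1-p)^{t-1} \bigr) \\
  &= p .
\end{aligned}
```

For $t = 1$ the sum is empty and the coefficient is $p(1-p)^0 = p$ directly.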



3 Arbitrary Probabilities 

Our result on the UPS problem with arbitrary probabilities, i.e. Corollary 1, 
is a consequence of Proposition 1. The proof of Proposition 1 is based on the 
FKG-Inequality [1, p. 75] and we only sketch it here. 

Let $p$ and $q$ be two sets of vertex probabilities and assume that $p$ dominates $q$.
Let $C$ be any tour. For $S \subseteq V$, let $f(S) := l(sc_S(C))$ and
$g(S) := \prod_{v \in S} p_v / q_v \cdot \prod_{v \notin S} (1 - p_v)/(1 - q_v)$. Observe that $S \mapsto \Pr_q[S]$ is
log-supermodular, that $f$ is increasing because of the triangle inequality, and that $g$ is
increasing because $p$ dominates $q$. Thus, the requirements of the FKG inequality
are met and we have

$$\sum_{S \subseteq V} f(S) \Pr_q[S] \cdot \sum_{S \subseteq V} g(S) \Pr_q[S] \le \sum_{S \subseteq V} f(S) g(S) \Pr_q[S] \cdot \sum_{S \subseteq V} \Pr_q[S].$$

Since $\Pr_p[S] = \Pr_q[S] \cdot g(S)$, this implies Proposition 1. □
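The four sums in the FKG step can be evaluated explicitly on a small instance, which makes it visible that the left side equals $\mathit{lsc}_q(C)$ and the right side equals $\mathit{lsc}_p(C)$. The sketch below is an illustration under stated assumptions: random Euclidean points, probabilities $q_v$ strictly between 0 and 1 so that $g$ is well defined, and hypothetical names throughout.

```python
import itertools
import math
import random

def cycle_length(order, dist):
    """Length of the closed cycle through `order` (0 for fewer than 2 vertices)."""
    m = len(order)
    if m < 2:
        return 0.0
    return sum(dist[order[i]][order[(i + 1) % m]] for i in range(m))

def fkg_sides(n=6, seed=2):
    """Return (left, right) of the FKG inequality for f, g and Pr_q as in the text."""
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    dist = [[math.dist(a, b) for b in pts] for a in pts]
    q = [rng.uniform(0.1, 0.6) for _ in range(n)]
    p = [qv + rng.uniform(0.0, 0.3) for qv in q]   # p dominates q, all p_v < 1
    sum_f = sum_g = sum_fg = sum_one = 0.0
    for mask in itertools.product((0, 1), repeat=n):
        pr_q = math.prod(q[v] if b else 1.0 - q[v] for v, b in enumerate(mask))
        g = math.prod(p[v] / q[v] if b else (1.0 - p[v]) / (1.0 - q[v])
                      for v, b in enumerate(mask))
        # The tour is the identity order 0, 1, ..., n-1.
        f = cycle_length([v for v, b in enumerate(mask) if b], dist)
        sum_f += f * pr_q
        sum_g += g * pr_q
        sum_fg += f * g * pr_q
        sum_one += pr_q
    return sum_f * sum_g, sum_fg * sum_one

left, right = fkg_sides()
```

Because both $\sum_S g(S)\Pr_q[S]$ and $\sum_S \Pr_q[S]$ equal 1, the inequality reduces to $\mathit{lsc}_q(C) \le \mathit{lsc}_p(C)$.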
Corollary 1 follows by sandwiching $p$ between two appropriate sets of vertex
probabilities. More precisely, let

$$p^*_v := \begin{cases} p_{\min} & \text{if } p_v < 1, \\ 1 & \text{otherwise.} \end{cases}$$

Then $p^*$ is dominated by $p$, which in turn is dominated by the all-ones probability
set. The corollary follows from

$$\mathrm{opt} = \mathit{lsc}_p(C^{UPS}) \ge \mathit{lsc}_{p^*}(C^{UPS}) \ge p_{\min} \, l(C^{TSP}) \ge p_{\min} \, \mathit{lsc}_p(C^{TSP}),$$

where the first and the last inequality are implied by Proposition 1 and the
second inequality by Theorem 1, applied to $C = C^{UPS}$.



4 Tight Examples 

Assume that $p < 1$ and let $\varepsilon > 0$. In this section we give the construction of an
instance $\mathcal{I}$ of the UPS problem with probabilities $p$ and 1. It has two TSP tours
$C_1$ and $C_2$, and (3) states that their UPS values differ by a factor of at least $1/(p + \varepsilon)$.
This implies Theorem 2.

We assume w.l.o.g. that $\varepsilon < 1 - p$. Let $k$ be a positive integer, large enough
so that $k + \log_{1-p}(8k^2) \ge \log_{1-p} \varepsilon$. Let $n$ be a prime such that $2k^2 < n < 4k^2$.
Then

$$2n(1-p)^k < 8k^2 (1-p)^k \le \varepsilon. \tag{2}$$

Let $V := \{0, \ldots, n-1\}$ and let $H := (V, E)$, where

$$E := \{ij : j - i \pmod n \le k\}.$$

That is, $H$ is a cycle on $n$ vertices plus all chords of length at most $k$. Let
$l(e) = 1$ for all $e \in E$. Then two TSP tours for $H$ and $l$ are (indices are taken
modulo $n$, as usual)

$$C_1 := (0, k, \ldots, ik, \ldots, nk) \quad \text{and} \quad C_2 := (0, 1, \ldots, i, \ldots, n).$$

Note that $C_1$ is a tour because of the primality of $n$.

Let $p_0 := 1$ and $p_i := p$ for $i \ge 1$. Then $\mathcal{I}$ consists of (the closure of) $H$, $l$,
and $p$. In the rest of this section we shall prove that

$$\mathit{lsc}_p(C_2) \le (p + \varepsilon) \, \mathit{lsc}_p(C_1). \tag{3}$$

Let $S$ be a randomly chosen subset of $V$. Call $S$ dense for $C_1$ if $S$ intersects
any set of $k$ consecutive vertices of $C_1$. The probability that $S$ is not dense for $C_1$ is at most
$n(1-p)^k$, where $n(1-p)^k \le \varepsilon/2$ by (2). The event that $S$ is dense for $C_2$ is
defined analogously, and its probability is the same.

Assume that $S$ is dense for $C_1$ and let $ik$ and $jk > ik$ be two subsequent
vertices of $sc_S(C_1)$. Then $j - i \le k$ because $S$ is dense for $C_1$. But then
$jk - ik \le k^2 < n/2$, and by the choice of $H$ the distance between $ik$ and $jk$ is $j - i$. Assume
that $sc_S(C_1) = (i_1 k, \ldots, i_{|S|} k)$ and let $i_0 := i_{|S|}$. Then

$$n = \sum_{j=1}^{|S|} (i_j - i_{j-1} \pmod n) = \sum_{j=1}^{|S|} l(i_j k, i_{j-1} k) = l(sc_S(C_1)),$$

and therefore

$$\mathit{lsc}_p(C_1) \ge n \Pr[S \text{ is dense for } C_1] \ge (1 - \varepsilon/2)\, n. \tag{4}$$

If $S$ is dense for $C_2$, then there is a chord between any two subsequent vertices
of $sc_S(C_2)$ and thus $l(sc_S(C_2)) = |S|$. This implies that

$$\mathit{lsc}_p(C_2) \le n \Pr[S \text{ is not dense for } C_2] + \sum_{S \subseteq V} |S| \Pr[S] \le (p + \varepsilon/2)\, n. \tag{5}$$

As a consequence of (4) and (5),

$$\frac{\mathit{lsc}_p(C_2)}{\mathit{lsc}_p(C_1)} \le \frac{p + \varepsilon/2}{1 - \varepsilon/2},$$

which implies (3), using $\varepsilon < 1 - p$. □
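The construction can be explored on a toy scale by computing both expectations exactly. The sketch below uses $n = 11$, $k = 2$, $p = 0.5$, which satisfy $2k^2 < n < 4k^2$ with $n$ prime but are far too small for (2) to hold with a small $\varepsilon$; it therefore only illustrates that the two TSP tours have different UPS values, not the asymptotic factor. All names are hypothetical, and the distance function assumes the metric closure of $H$ with unit edge lengths.

```python
import itertools

def hop_distance(a, b, n, k):
    """Shortest-path distance in H: a cycle on n vertices plus chords of length <= k."""
    gap = abs(a - b)
    gap = min(gap, n - gap)
    return -(-gap // k)            # integer ceiling of gap / k

def lsc(tour, n, k, p):
    """Exact expected shortcut length; vertex 0 has probability 1, all others p."""
    others = tour[1:]              # tour[0] is vertex 0 by construction
    total = 0.0
    for mask in itertools.product((0, 1), repeat=len(others)):
        m = sum(mask)
        pr = p ** m * (1.0 - p) ** (len(others) - m)
        chosen = [0] + [v for v, bit in zip(others, mask) if bit]
        if pr > 0.0 and len(chosen) >= 2:
            total += pr * sum(
                hop_distance(chosen[i], chosen[(i + 1) % len(chosen)], n, k)
                for i in range(len(chosen)))
    return total

n, k, p = 11, 2, 0.5                       # toy parameters, assumptions of this sketch
c1 = [(i * k) % n for i in range(n)]       # C_1: steps of length k
c2 = list(range(n))                        # C_2: steps of length 1
value_c1 = lsc(c1, n, k, p)
value_c2 = lsc(c2, n, k, p)
```

Both tours have length $n$ when every vertex is present, yet the expected shortcut length of $C_2$ is smaller, because skipping vertices along $C_2$ lets the chords of $H$ shorten the route.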

5 Hardness of Approximation 

The Metric TSP is APX-complete [7] and the currently best lower bound on the 
approximation threshold is in the asymmetric case and ^ in the symmetric 
case [6] . It is trivial that the UPS problem has the same lower bounds, because 
the objective functions coincide for the all-ones probability set $p_v = 1$. It might,
however, be interesting to verify that the same holds for the probability set
$p_v = p$, where $0 < p < 1$. This, together with Proposition 1, proves Theorem 3.

Next we present a reduction from Metric TSP to the UPS problem in
instances with probability set $p_v = p$ for each $\varepsilon > 0$. The reduction preserves the
approximation ratio up to a factor of $1 - \varepsilon$.

Let $V$ be a vertex set and let $l : V^2 \to \mathbb{R}^+$ be an instance of the Metric TSP
on $V$. The corresponding instance of the UPS problem consists of the following.
Add $c_{p,\varepsilon}(n)$ copies of every vertex to get $V'$ and let $l'$ be the extension of $l$ to
$V'$ such that all copies of the same vertex have distance 0 to each other and
copies of different vertices have the same distance as the original vertices. Here
$c_{p,\varepsilon}(n) = O(\log n)$ is chosen large enough that $(1 - (1-p)^{c_{p,\varepsilon}(n)})^n \ge 1 - \varepsilon$.
Note that this can be done in polynomial time.
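The number of copies needed per vertex can be found by a direct search on the defining inequality. The following sketch is illustrative; `copies_needed` is a hypothetical name, and the search assumes $0 < p \le 1$ and $0 < \varepsilon < 1$ so that the loop terminates.

```python
def copies_needed(n, p, eps):
    """Smallest c such that (1 - (1 - p)**c)**n >= 1 - eps.

    With c copies of each of the n original vertices, the left side is the
    probability that at least one copy of every original vertex is present.
    """
    c = 1
    while (1.0 - (1.0 - p) ** c) ** n < 1.0 - eps:
        c += 1
    return c

copies = copies_needed(n=10, p=0.5, eps=0.1)
```

For fixed $p$ and $\varepsilon$ the required count grows roughly like $\log(n/\varepsilon) / \log(1/(1-p))$, so the enlarged instance stays polynomial in size.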

Any tour on $V'$ whose performance ratio (for the UPS instance) is at most $\eta$
can be converted into a tour on $V$ whose performance ratio (for the original TSP
instance) is at most $\eta/(1 - \varepsilon)$. Indeed, let $C'$ be a tour on $V'$ whose performance
ratio is at most $\eta$ for the UPS problem. We may assume w.l.o.g. that all copies of
each original vertex occur subsequently in $C'$. Then $l'(sc_S(C')) = l(C)$ as long as
$S$ contains at least one copy of each original vertex. Since the probability for that
event is at least $1 - \varepsilon$, we have that $\mathit{lsc}_p(C') \ge (1 - \varepsilon)\, l(C)$, where $C$
is the ordering of $V$ induced by $C'$. Now let $C^{TSP}$ be a TSP tour on $V$. If we extend
it to a tour on $V'$ by visiting all copies of every vertex subsequently, we know
that its UPS value is at most $l(C^{TSP})$. Therefore the optimal UPS solution $C^{UPS}$ has
length at most $l(C^{TSP})$. Thus $(1 - \varepsilon)\, l(C) \le \mathit{lsc}_p(C') \le \eta\, \mathit{lsc}_p(C^{UPS}) \le \eta\, l(C^{TSP})$.

This completes the analysis of the reduction, and, together with Proposition 1,
the proof of Theorem 3. □

Abstract. We consider a collection of robots which are identical (anonymous),
have limited visibility of the environment, and no memory of the
past (oblivious); furthermore, they are totally asynchronous in their actions,
computations, and movements. We show that, even in such a totally
asynchronous setting, it is possible for the robots to gather in the
same location in finite time, provided they have a compass.

Keywords: Distributed algorithms, coordination, control, mobile 
robots. 



1 Introduction 

In current robotics research, both from engineering and behavioral viewpoints, 
the trend has been to move away from the design and deployment of few, rather 
complex, usually expensive, application-specific robots. Instead, the interest has 
shifted towards the design and use of a large number of “generic” robots which 
are very simple, with very limited capabilities and, thus, relatively inexpensive. 

In particular, each robot is only capable of sensing its immediate surrounding, 
performing computations on the sensed data, and moving towards the computed 
destination; its behavior is an (endless) cycle of sensing, computing, moving and 
being inactive (e.g., see [2,7,8,9]). On the other hand, the robots should be able,
together, to perform rather complex tasks. Examples of typical basic tasks
are gathering, leader election, pattern formation, scattering, etc. 

A very important set of questions concerns the robots' capabilities,
that is, how "simple" the robots can be and still perform the required task [3]. In
computational terms, the question is to identify the factors which influence the
solvability of a given problem (the task).

These questions have been extensively studied both experimentally and
theoretically in the unlimited visibility setting, that is, assuming that the robots are
capable of sensing ("seeing") the entire space (e.g., see [4,6,10,12]). In general and
more realistically, robots can sense only a surrounding with a radius of bounded
size. This setting, called the limited visibility case, is understandably more
difficult, and only few algorithmic results are known [1,11].

In this paper we are interested in gathering: the basic task of having the
robots meet in the same location (the choice of the location is arbitrary). Since
the robots are modeled as points in the plane, the task of robot gathering is
also called the point formation problem. Gathering (or point formation) has been
investigated both experimentally and theoretically. In particular, in the limited
visibility setting, Ando et al. [1] presented a gathering algorithm for
indistinguishable robots which are placed on a plane without any common coordinate
system; their algorithm does not require the robots to remember observations or
computations performed in the previous steps. Their result implies that
gathering can be performed with limited visibility by very simple robots: anonymous,
oblivious and disoriented.

Their solution, however, is based on a very strong "atemporal" assumption
on the duration of the robots' actions: their robots must be capable, in every
cycle, of performing all the sensing, computing and moving instantaneously.

This assumption has many consequences crucial for its correctness. For exam- 
ple, since movement is instantaneous, a robot can not be seen by the others while 
moving (and its temporary position mistaken for a destination location); since 
sensing and computing is instantaneous, a robot always has available the correct 
current situation of its neighborhood. Note that, since instantaneous movement 
is not physically realizable, their solution is only of theoretical interest. 

In this paper, we study the gathering problem in the most general case of an
asynchronous system of robots with limited visibility, where both their
computations and their movements require a finite but otherwise unpredictable amount
of time. The question motivating our investigation is whether point formation is
possible in such a system. Since in these systems gathering is unsolvable if the 
robots are disoriented (i.e., have no common system of coordinates), we shall 
restrict ourselves to systems with sense of direction (i.e., the robots share the 
same coordinate system). 

In this paper we show that indeed anonymous oblivious robots with limited 
visibility can gather within a finite number of moves even if they are fully asyn- 
chronous. In fact, we describe a new algorithm for solving the point formation 
problem in the asynchronous setting by anonymous oblivious robots with limited 
visibility. We then prove its correctness showing that the robots will gather in 
a point within a finite amount of time. This result holds not only allowing each 
activity and inactivity of the robots to be totally unpredictable (but finite) in 
duration, but also making their movement towards a destination unpredictable 
in length (but not infinitesimally small). In other words, we show that gathering 
can be performed by simpler robots with fewer restrictions than known before, 
provided they have a common coordinate system. 

From a theoretical point of view, this result proves that, with respect to the
gathering problem, "sense of direction" has the same computational power as
"instantaneous actions". From a practical point of view, this result has
fundamental consequences. In fact, it allows one to substitute a theoretically interesting
but physically unrealizable motorial and computing capability requirement
(instantaneous actions) with a property (sense of direction) which is both simple
and inexpensive to provide (e.g., by a compass).
The paper is organized as follows. In Section 2 the model under study is
formally presented. In Section 3 the notations used in the paper and some useful 
geometric lemmas are introduced. The gathering algorithm is described in Sec- 
tion 4, and in Section 5 its correctness is proven. Due to space limitations, some 
of the proofs are omitted and can be found in [5] . 

2 The Model 

We consider a system of autonomous mobile robots. Each robot is capable of 
sensing its immediate surrounding, performing computations on the sensed data, 
and moving towards the computed destination; its behavior is an (endless) cycle 
of sensing, computing, moving and being inactive. 

The robots are modeled as units with computational capabilities, which are 
able to freely move in the plane. They are viewed as points, and are equipped 
with sensors that let each robot observe the positions of the others with respect 
to its local coordinate system. Each robot can see only a portion of the plane; 
more precisely, it can observe whatever is at most at a fixed distance $V$ from it
(limited visibility).

Each robot has its own local view of the world. This view includes a local 
Cartesian coordinate system with origin, unit of length, and the directions of two 
coordinate axes, together with their orientations, identified as the positive and 
negative sides of the axes. In this paper we assume that the robots share the same 
coordinate system (sense of direction); however, they do not necessarily agree
on the location of the origin (that we can assume, without loss of generality, to 
be placed in the view of a robot in its own current position), nor on the unit 
distance. 

The robots are oblivious, meaning that they do not remember any previous 
observation nor computations performed in the previous steps. The robots are 
anonymous, meaning that they are a priori indistinguishable by their appear- 
ances, and they do not have any kind of identifiers that can be used during the 
computation. Moreover, there are no explicit direct means of communication: 
the communication occurs in a totally implicit manner. Specifically, it happens 
by means of observing the change of its fellows’ positions in the plane while they 
execute the algorithm. 

Summarizing, the robots are oblivious, anonymous, and with limited visibil- 
ity; they do however have a common coordinate system. 

They execute the same deterministic algorithm, which takes as input the
observed positions of the robots within the visibility radius, and returns a
destination point towards which the executing robot moves. A robot is initially in
a waiting state (Wait); at any point in time, asynchronously and independently
from the other robots, it observes the environment in its area of visibility (Look),
it calculates its destination point based only on the current locations of the
observed robots (Compute), it then moves towards that point (Move) and goes
back to a waiting state. The sequence Wait (W) - Look (L) - Compute (C) -
Move (M) will be called a computation cycle of a robot.
The robots are fully asynchronous. In particular, the amount of time spent
in a computation, in a movement, and in inactivity is finite but otherwise
unpredictable. Moreover, a robot moving towards the computed destination can
stop after an unpredictable amount of space, provided that this amount is neither infinite nor
infinitesimally small (unless it reaches its destination). More precisely, the only
assumptions made are the following:

Assumption A1. Any robot will complete its cycle in an amount of time which
is finite and bounded from below.

Assumption A2. The distance traveled by a robot in a move is finite and 
bounded from below (unless the destination is closer than the bound) . 

As a consequence, the (global) time that passes between two successive move- 
ments of the same robot is finite; furthermore, while a robot is moving, it can 
be seen an unpredictable but finite number of times by another robot. 

3 Notations and Geometric Lemmas 

We first define sets related to the state a robot is in at a given time during the
computation.

$W(t)$ and $L(t)$ are the sets of all the robots that are, respectively, in state W and
L at time $t$.

$C(t) = C_\emptyset(t) \cup C_+(t)$ is the set of all the robots that at time $t$ are computing.
The set $C_\emptyset(t)$ contains those robots whose computation's result is to stay still
(we say that they execute a null movement), while $C_+(t)$ contains those robots
whose computation's result is some destination point (we say that they will
execute a real movement).

$M(t) = M_\emptyset(t) \cup M_+(t)$ is the set of all the robots that at time $t$ are executing
a movement. The set $M_\emptyset(t)$ contains the robots executing a null movement
(they stay still); $M_+(t)$ contains those executing a real movement (they are
effectively moving towards a destination).

We define the circle of visibility $C_i(t)$ of a robot $r_i$ at time $t$ as the circle of radius $V$
centered in $r_i$, if $r_i \in L(t)$. Otherwise $C_i(t) = C_i(t')$, where $t' = \max\{\tau < t \mid r_i \in L(\tau)\}$.

In other words, if a robot is observing, its circle of visibility is the circle of
radius $V$ centered in itself; otherwise, it is the circle of radius $V$ centered in the
location of its most recent Look phase. Where no ambiguity arises, the parameter
$t$ in $C_i(t)$ will be omitted.

We now introduce some notations and geometrical lemmas which will be 
needed later. Let A and B be two points; with AB we will indicate the segment 
starting in A and terminating in B. When no ambiguity arises we will also use 
the notation AB to denote the length of such a segment. Let A and B be two 
points on a circle; with arc(AB) we indicate the smallest arc on the circle passing 
through A and B. r indicates a generic robot in the system (when no ambiguity 
arises, r is used also to represent the point in the plane occupied by robot r); 
capital italic letters indicate regions (e.g. $\mathcal{L}$, $\mathcal{R}$); given a region, we denote by $|\cdot|$
the number of robots in that region.
Lemma 1. Every internal chord of a general triangle has length less than or equal
to the longest side of the triangle.

Lemma 2. Let $Q$ be a convex quadrilateral. If all the sides and the two internal
diagonals have length less than or equal to $V$, then every internal chord of $Q$ has length less than
or equal to $V$.

Lemma 3. Let $OB$ be the radius of a circle centered in $O$ and $D$ be a point on
the circle such that $\angle BOD = \beta$, with $0 < \beta < 90°$. Then $pC \le BC$, $\forall p \in \mathrm{arc}(BD)$
and $\forall C \in OD$ (see Figure 1.b).

4 The Algorithm 

Let us call Universe ($\mathcal{U}$) the smallest isothetic rectangle containing the initial
configuration of the robots, and let us call Right and Bottom, respectively, the
rightmost and the bottommost side of $\mathcal{U}$.

The idea of the algorithm is to make the robots move either towards the
bottom or towards the right of the Universe (a robot will never move up or to
its left), in such a way that, after a finite number of steps, they will gather at
the bottommost, rightmost corner of the Universe.

A robot $r$ can move only if it sees no robot either to its left
or above it on its vertical axis. Several situations could arise depending on the
positions of the robots in its area of visibility:

— If r does not see any robot, it does not move; 

— If r sees robots only below on its vertical axis, it moves down towards the 
nearest robot; 

— If r sees robots only to its right, it moves horizontally towards the vertical 
axis of the nearest robot 

— If r sees robots both below on its axis and on its right, it computes a desti- 
nation point and performs a diagonal move towards the right. 

Recall that $C_i$ is the circle of visibility of robot $r_i$. Let $AA'$ be the vertical
diameter of such region; let $\mathcal{R}_i$ and $\mathcal{L}_i$ denote the regions to the right and to the
left of $r_i$, respectively (see Figure 1). Let $S_p = r_iA'$ and $S_o = r_iA$.

Fig. 1. (a) The Notation Used in Algorithm 1; (b) Lemma 3; (c) Lemma 6.

Algorithm 1 (Gathering).

    Extrem := ($|\mathcal{L}_i| = 0 \wedge |S_p| = 0$);
    If I am $\neg$Extrem Then
        Do_nothing();
    Else
        If ($|\mathcal{R}_i| = 0 \wedge |S_o| = 0$) Then
            Do_nothing();
        If $|\mathcal{R}_i| = 0$ Then
            $r_j$ := nearest visible robot on $S_o$;
            Move($r_j$).
        If ($|\mathcal{R}_i| \ne 0 \wedge |S_o| = 0$) Then
            $I_i$ := Nearest();
            $H_i$ := H_Destination($I_i$);
            Move($H_i$).
        If $|\mathcal{R}_i| \ne 0$ Then
            $I_i$ := Nearest();
            Diagonal_Movement($I_i$).

Nearest() returns the vertical axis on which the robot in $\mathcal{R}_i$ with the nearest
axis to $r_i$ lies.

H_Destination($I_i$) returns the intersection between $I_i$ and a line parallel to
the $x$ direction and passing through $r_i$.

Move($p$) terminates the local computation of the calling robot and moves it
towards $p$.

In the last case of Algorithm 1, $r_i$ sees somebody below it and somebody
to its right; therefore, to avoid losing some robots, it has to move diagonally, as
indicated by the following routine.
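The case analysis of Algorithm 1 can be sketched as a small function that classifies the move of a robot from the positions it observes. This is an illustration, not the authors' code: the function name, the return labels, and the convention that the observing robot sits at the origin of a frame with x pointing right and y pointing up are all assumptions of the sketch, and the diagonal case only reports the x-coordinate of the nearest axis.

```python
def gathering_case(visible):
    """Classify the move of a robot at the origin.

    `visible` lists the (x, y) positions of the other robots inside the
    circle of visibility, in the observing robot's own frame.
    """
    left = [q for q in visible if q[0] < 0]
    above = [q for q in visible if q[0] == 0 and q[1] > 0]
    below = [q for q in visible if q[0] == 0 and q[1] < 0]
    right = [q for q in visible if q[0] > 0]

    if left or above:                       # the robot is not "Extrem"
        return ("stay", None)
    if not right and not below:             # nobody to move towards
        return ("stay", None)
    if not right:                           # only robots below on the axis
        return ("down", max(below, key=lambda q: q[1]))
    axis_x = min(q[0] for q in right)       # nearest vertical axis to the right
    if not below:                           # horizontal move onto that axis
        return ("horizontal", (axis_x, 0.0))
    return ("diagonal", axis_x)             # handled by Diagonal_Movement
```

For example, a robot that sees one robot directly below it and one to its right falls into the diagonal case, while a robot that sees anyone to its left stays still.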

Algorithm 2 (Diagonal_Movement($I_i$)).

    $B$ := upper intersection between $C_i$ and $I_i$;
    $A$ := point on $S_o$ at distance $V$ from me;
    $2\beta$ = $\angle A r_i B$;
    If $\beta < 60°$ Then
        $B$ := Rotate($r_i$, $B$).
    $H_i$ := D_Destination($V$, $I_i$, $A$, $B$);
    Move($H_i$).

Rotate($r_i$, $B$) rotates the segment $r_iB$ in such a way that $\beta = 60°$ and
returns the new position of $B$.

With D_Destination($V$, $I_i$, $A$, $B$), $r_i$ computes its destination in the following
way: the direction of its movement is given by the perpendicular to the segment
$AB$; the length of $r_iH_i$ is $\min\{V, \text{the distance of } I_i \text{ according to the direction of movement}\}$.
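One way to read the diagonal routine in coordinates is sketched below. It is an interpretation under stated assumptions, not the authors' implementation: the robot is placed at the origin, the nearest axis is the vertical line x = d with 0 < d <= V, and the perpendicular from the robot to AB is taken to bisect the angle between the robot's segments to A and B (which holds because both have length V). The name `diagonal_destination` is hypothetical.

```python
import math

def diagonal_destination(V, d):
    """Destination of a robot at the origin moving diagonally towards the axis x = d."""
    if not 0.0 < d <= V:
        raise ValueError("the axis must lie to the right, within the visibility radius")
    # B = (d, b_y) is the upper intersection of the circle of visibility with the axis.
    b_y = math.sqrt(max(0.0, V * V - d * d))
    # A = (0, -V); the angle A r_i B equals 2*beta and lies in [90, 180) degrees.
    two_beta = math.acos(max(-1.0, min(1.0, -b_y / V)))
    # Rotate step: enforce beta >= 60 degrees.
    beta = max(two_beta / 2.0, math.radians(60.0))
    # Move along the bisector, at most V and at most up to the axis.
    reach = d / math.sin(beta)
    length = min(V, reach)
    return (length * math.sin(beta), -length * math.cos(beta))

hx, hy = diagonal_destination(1.0, 0.5)
```

In this reading, when no rotation is needed the destination lies exactly on the axis and coincides with the fourth vertex of the parallelogram on A, the robot, and B; when the rotation applies, the move has length exactly V and stops short of the axis.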

5 Correctness 

In this section we will prove the correctness of the algorithm by first showing 
that the robots which are mutually visible at any point of the computation, will 
stay mutually visible until the end of the computation, and concluding that at 
the end of the computation all robots will gather in one point. We first introduce 
some lemmas. From Assumptions A1 and A2 it directly follows that: 

Lemma 4. Let $r_i$ and $r_j$ be two generic robots and let $t$ and $t' > t$ be two moments
of the computation. If $r_i \in L(t)$, $r_i \in L(t')$, $r_j \in M(t)$, $r_j \in M(t')$, $r_j \in C_i(t)$
and $r_j \in C_i(t')$, then $r_j$ cannot be in the same point at $t$ and $t'$.

Moreover, from the Gathering algorithm it follows that:

Lemma 5. Let $r_j$ and $r_i$ be two arbitrary robots, with $r_i$ to the right of $r_j$ at time
$t$. If $r_j \in L(t)$ and $r_jr_i \le V$, then $r_j$ cannot pass $r_i$ in one step.

Let us consider a generic robot $r_i$ executing the algorithm. Let $\beta$ be the
angle between the vertical axis of $r_i$ and the direction of its movement ($\angle A r_i H_i$
in Figure 1.c).

Lemma 6. The segment $r_iH_i$ is always smaller than or equal to $V$. Moreover, $BH_i =
AH_i = V$ and $pH_i \le V$, $\forall p \in r_iA$.

Thus, $(A, r_i, B, H_i)$ is a parallelogram. We now introduce the definition of
visibility graph. The visibility graph $G = (N, E)$ of the robots is a graph whose
node set $N$ is the set of the input robots and, $\forall r_i, r_j \in N$, $(r_i, r_j) \in E$ iff $r_i$ and
$r_j$ are initially at distance smaller than the visibility radius $V$. We first show that
the visibility graph must be connected in order for the algorithm to be correct.

Lemma 7. If the visibility graph G is disconnected, the problem is unsolvable. 

Thus, in the following we will always assume that G is connected. 
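Whether a given initial configuration meets this connectivity assumption can be checked with a breadth-first search over the visibility graph. The sketch below is illustrative; the function name is hypothetical, and it treats two robots as adjacent when their distance is at most V, which is an assumption about how the boundary case is handled.

```python
import math
from collections import deque

def visibility_graph_connected(positions, V):
    """Return True if the visibility graph of the initial positions is connected."""
    n = len(positions)
    if n == 0:
        return True
    seen = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in range(n):
            # Robots i and j are adjacent if they are within the visibility radius.
            if j not in seen and math.dist(positions[i], positions[j]) <= V:
                seen.add(j)
                queue.append(j)
    return len(seen) == n

chain_ok = visibility_graph_connected([(0, 0), (1, 0), (2, 0)], V=1.0)
split = visibility_graph_connected([(0, 0), (3, 0)], V=1.0)
```

Note that no individual robot can run such a check: it requires the global configuration, which is exactly the knowledge the robots lack.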

5.1 Preserved Visibility 

In this section we prove that the visibility graph is preserved during the entire 
execution of the algorithm. We prove so by introducing the notion of mutual 
visibility and by showing that the robots which are connected in the visibility 
graph (i.e., those which are initially within distance V) will eventually become 
mutually visible, and that two robots that are mutually visible at some point in 
the algorithm will stay mutually visible until the end of the computation. 

Informally speaking, we say that two robots are mutually visible if each 
robot includes the other one in its computation, namely each of them had seen 
the other one during its observation phase. Formally, two robots r_1 and r_2 are
mutually visible at time t iff

- r_1 ∈ (L(t) ∪ C(t) ∪ M(t)) ∧ r_2 ∈ C_1(t) ∧ r_2 ∈ (W(t) ∪ L(t)), or

- r_2 ∈ (L(t) ∪ C(t) ∪ M(t)) ∧ r_1 ∈ C_2(t) ∧ r_1 ∈ (W(t) ∪ L(t)).

Since all the robots at the beginning are in W, from the above definition we 
have that the robots that at the beginning are within distance V will become 
mutually visible in finite time. That is, the following lemma holds: 

Lemma 8. Let r_i and r_j be two robots that at the beginning are within distance
V. Robots r_i and r_j will become mutually visible in a finite number of steps.

We now introduce a couple of lemmas which will be useful to prove that 
mutually visible robots will stay so until the end of the algorithm. Let r be a
generic robot on an axis S. Let S' and S'' be two vertical axes to the right of S.
We will denote by SS' and SS'' the distances between the corresponding axes.
Then we have:

Lemma 9. SS' < SS'' ⇔ β_S' > β_S'', where β_S' and β_S'' are respectively the
angles computed by the routines Diagonal_Movement(S') and Diagonal_Movement(S'')
(Figure 2.a).



Fig. 2. (a) Lemma 9; (b) and (c) Lemma 10.



Lemma 10. Let us consider the situation depicted in Figure 2.b, where F is a
point at distance < V from r_i on its axis (with F r_i), H_i is the destination
point of r_i. Let ps be a segment in △(F, M, K), with s to the right of p, and s'
the projection of s over r_i H_i. Then we have l l' < V, ∀ l ∈ ps, ∀ l' ∈ s'H_i.

We are now ready to show that, as soon as two robots become mutually
visible, they will stay mutually visible. We first prove that this property holds
when two mutually visible robots lie on the same vertical axis, and then we prove
that it holds for two robots lying on different vertical axes. In the next lemma
we will refer to the notation introduced in Figure 1.a and Lemma 10.

Lemma 11. Let r_i and r_j be robots which are mutually visible at time t; moreover,
let them lie, at time t, on the same vertical axis with r_j being below r_i.
There is a time t' > t when r_i and r_j are mutually visible. Moreover, between t
and t', r_i r_j < V.

Proof. Let us first consider the case when R_i is empty. In such a case, r_i would
clearly move towards r_j (shortening their distance), while r_j would not move.
Since by Algorithm 1 r_i cannot pass r_j, the first time r_i stops while it is moving
towards r_j the mutual visibility definition holds, and the lemma follows.

Let us now consider the more interesting case when R_i is not empty. In the
following we shall consider several situations:

Case i: r_j does not look until r_i reaches its destination H_i. We have that r_j ∈ W
while r_i is moving towards H_i. Since AH_i = V (Lemma 6) and r_i H_i ≤ V
(Lemma 6), we have that, by Lemma 1 on △(r_i, A, H_i), the distance between
r_i and r_j is always < V while r_i is moving. Therefore, the first time r_i stops
along its path (at most on H_i), the mutual visibility definition applies and
the lemma follows.

Case ii: r_j looks while r_i is moving towards its destination H_i. Since r_i is on
r_j's right, r_j cannot perform a Vertical Move. Hence, r_j can either decide
not to move (because it sees some robots above) or to move. In the first
case the proof reduces to the one of Case i. On the other hand, r_j can decide
to move after having looked. From Case i we know that r_j can see r_i on its
right. Moreover, it might also see some other robots below it, that can be
either on the same axis (r_j performs a Diagonal Move) or not (r_j performs
a Horizontal Move). The following applies to both situations (Figure 3).



Fig. 3. Case ii of Lemma 11.



Let us call l^w the w-th axis, counting from S, from where r_j looks while r_i
is still on its way towards H_i, and p_w the point on this axis from where
r_j performs the look phase. Clearly l^0 = S and F = p_0 coincides with the
position of r_j on S. In the following we will prove by induction that

a. l^w is to the left of l^{w+1},

b. The destination point d_{w+1} that r_j computes when it is on l^w is inside
△(F, K, M),

c. p_{w+1} r_i < V, and l^{w+1} is to the left of r_i.

Basis. Let d_1 be the first destination point r_j computes. Since r_i is on its
right, r_j can only decide to perform a Diagonal Movement, therefore d_1
must be to the right of l^0, and as a consequence l^0 is to the left of l^1.
Moreover, by Lemma 9 we know that r_j d_1 must lie above r_j M, hence p_1
(that is on r_j d_1) must be within △(F, K, M). Finally, r_j can see r_i by
hypothesis and at the beginning l^0 is to the left of r_i, and the basis of
the induction follows.

Inductive Step. Let us assume that all the statements are true for 1, ..., w.
Since by inductive hypothesis l^w is to the left of r_i and r_j can see r_i
from p_w, r_j can only decide to perform a Diagonal Movement; therefore
d_{w+1} must be to the right of l^w and cannot be after r_i (because of how
Diagonal_Movement(·) works), and, as a consequence, l^w is to the left
of l^{w+1}, and a. follows.

Moreover, since l^w l_i < S l_i, by Lemma 9 we have that p_w d_{w+1}
must be above FM but cannot be above FK (because the algorithm
does not allow "up" movements). Therefore point b. follows.
Furthermore, since b. holds and p_{w+1} cannot be after d_{w+1}, by Lemma
10 c. follows, and the induction is proved.

Now we know that all the stops r_j makes while r_i is moving towards H_i are
inside △(F, K, M), hence, by Lemma 10, within distance V from r_i. Thus
we have that, when r_i reaches H_i, it can see r_j on its left; therefore, it cannot
move further. It follows that, as long as r_j is before it, r_i can only be in L(·),
C(·) or W(·). Therefore, the first time that r_j stops after r_i reached H_i,
say at time t' > t, r_i and r_j will be mutually visible. Moreover, between t and
t', by Lemma 10, r_i r_j < V, and the lemma follows. □

In the following lemma we show that if a robot sees some robots on its right,
then it will never lose them during the computation. Let r_i be a robot in the
system, R be the set of robots which are mutually visible with r_i at time t and
that are located to the right of r_i, and r_k a robot in R (Figure 4). Moreover,
let B and C be respectively the upper and lower intersection between l_i and C_i,
and H_i' be the intersection between C_i and the line passing through r_i H_i.

Lemma 12. There exists a time t' > t after which r_i will be always mutually
visible with the robots in R. Moreover, r_i r_k < V, ∀ r_k ∈ R.

Proof. From Algorithm 1, we know that robots in R cannot perform any movement
while r_i is on their left. Let t* be the time when r_i enters its Look phase and
p be the destination point it computes. Clearly, p cannot be to the right of any
robot in R. In the following, we first prove that l r_k < V, ∀ r_k ∈ R and ∀ l ∈ r_i p.

From Lemma 3, it follows that: ∀ p ∈ arc(BH_i'), pH_i < BH_i = V (1).
Moreover, H_iC = BC − BH_i < 2V − V = V and from Lemma 2 we have:
∀ p ∈ arc(H_i'C), pH_i < H_iC < V (2). Combining (1) and (2) we obtain: ∀ p ∈
arc(BC), pH_i < V (3).

Let us now consider a robot r_k ∈ sector(BCB) (that is, in the area to the
right of l_i and in C_i) and let s' be the intersection between arc(BC) and the
line passing through H_i and r_k. We have that H_i r_k < H_i s' < V (from (3)),
r_i r_k < V, and r_i H_i < V. Therefore, applying Lemma 1 to △(r_i, r_k, H_i) we have
that q r_k < V, ∀ q ∈ r_i H_i. In conclusion, when r_i stops in p, say at time t' > t, it
will see all the robots in R, that can only be in L(t'), C(t') or W(t'), and the
lemma follows. □

By Lemmas 8, 11 and 12 we can conclude that:

Theorem 1. The visibility graph G is preserved during the execution of the 
algorithm. 

5.2 Finiteness 

In this section we will prove that, after a finite number of steps, the robots will 
gather in a point. 

Lemma 13. Suppose there are several robots on a vertical axis A and no
robots to the left of A. If r is the topmost robot on A that can see a robot to the
right of A, then, in a finite number of steps, either all the robots above r on A
will reach r, or one of them will leave A.

The next two lemmas show that all the robots in the system converge to the 
Right axis of the Universe, and actually reach it. 

Lemma 14. For any given vertical axis l before Right which is at any distance
d > 0 from it, all the robots that are on the left of l at the beginning of the
algorithm will pass l in a finite number of steps.

Lemma 15. After a finite number of steps, all the robots in the system reach
Right.

The following lemma states what happens when all the robots lie on the
same vertical axis: they will reach the bottommost robot on that axis in a finite
number of steps.

Lemma 16. If all the robots of the system lie on the same vertical axis A, then
in a finite number of steps all the robots will reach the bottommost robot on A.

We can finally conclude that: 

Theorem 2. In a finite number of steps, all the robots in the system gather in
a point: the rightmost and bottommost corner of the Universe.

Abstract. Langton’s ant is a simple discrete dynamical system, with
a surprisingly complex behavior. We study its extension to general pla- 
nar graphs. First we give some relations between characteristics of finite 
graphs and the dynamics of the ant on them. Then we consider the infi- 
nite bi-regular graphs of degrees 3 and 4, where we prove the universality 
of the system, and in the particular cases of the square and the hexag- 
onal grids, we associate a P-hard problem to the dynamics. Finally, we 
show strong spatial restrictions on the trajectory of the ant in infinite 
bi-regular graphs with degrees strictly greater than 4, which contrasts 
with the high unpredictability on the graphs of lower degrees. 

1 Introduction 

The virtual ant defined by Chris Langton ([1], [2]) is a simple system where an
agent, the “ant”, moves on the square grid. Each cell is in one of two states, 
to-left or to-right, and the ant is represented as an arrow between two adjacent 
cells. It moves one cell forward at each time step, turning according to the state 
of the cells, and switching these states thereafter. Interesting behavior follows: a 
single ant, starting with all cells in the to-left state, has a more or less symmetric 
trajectory in the first 500 steps; then it goes seemingly randomly for about 10,000 
steps, until it suddenly starts building an infinite diagonal “highway” (a periodic 
motion with drift). 
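
The behavior just described can be reproduced with a few lines of Python. The sketch below assumes the common cell-based formulation of the ant (the ant sits on a cell rather than on an arrow between two cells) and an arbitrary choice of which state produces which turn; the function name run_ant is hypothetical. The period of 104 steps used in the note after the code is the value commonly reported for the highway and is not stated in this paper.

```python
def run_ant(steps):
    """Simulate the ant on an unbounded square grid and return its trajectory.

    Assumed convention: `flipped` holds the cells whose state differs from the
    initial one. On an unflipped cell the ant makes a quarter turn to the right,
    on a flipped cell a quarter turn to the left; it then switches the cell and
    moves one cell forward.
    """
    flipped = set()
    x, y = 0, 0
    dx, dy = 0, 1              # initial heading (arbitrary)
    trajectory = [(x, y)]
    for _ in range(steps):
        if (x, y) in flipped:
            dx, dy = -dy, dx   # quarter turn to the left
            flipped.remove((x, y))
        else:
            dx, dy = dy, -dx   # quarter turn to the right
            flipped.add((x, y))
        x, y = x + dx, y + dy
        trajectory.append((x, y))
    return trajectory
```

Comparing positions that are 104 steps apart late in the run, for instance trajectory[12000] and trajectory[12104], should give the same nonzero diagonal displacement wherever the comparison is made, which is the "periodic motion with drift" mentioned above.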

As [3] points out, the ant is so “natural” that it has been independently
invented at least three times. Langton proposed it as a simple model of artificial
life [1], and it appeared again as one of the “turmites”, the two-dimensional
Turing machines studied by G. Turk [4]. It has also been studied as a paradigm
for signal propagation in random media, in particular as a model of a particle
in two-dimensional Lorentz Lattice Gases [5]. Another source of interest is
the relation with the agent-based systems (also called “ant systems”) that have
been intensively studied and applied to several optimization problems in recent
years, with good performance but few exact mathematical results. Langton’s ant
shares with them the so-called “stigmergy”: the movement of the agent is
determined by some properties of the environment next to it, and these properties
are in turn modified by that movement.

© Springer-Verlag Berlin Heidelberg 2001

Anahí Gajardo, Eric Goles, and Andrés Moreira



The ant has motivated several studies, both experimental and analytical. It
has been analyzed in the other regular grids, like the triangular ([6], [5], [7]) and
the hexagonal grids [7]. The case of bi-regular graphs of degree 3 was studied in
[8], and some possible definitions for the ant on the line were examined in [9]
and [10]. There have been some generalizations of the ant, to allow more than
two states of the cell and to consider several ants.

The dynamics of the ant is strongly related to the topology of the underlying
graph. On the triangular grid, the trajectory is always restricted to two rows, and
is easily predicted [6]. The hexagonal grid is again different: when starting with
all cells in the same state, the ant follows paths that are bilaterally symmetric
with respect to the starting position, and no highway appears [8].

The most important result concerning the dynamics of the ant on the square 
grid, due to [3], states that the set of the cells that are visited infinitely often by 
the ant (for a given initial configuration) has no corners. A corner of a set is a cell 
where at least two neighbors are not in the set, and these are not opposite to each 
other. The main consequence of this is the following fact (already demonstrated 
in [11]): For any initial configuration, the trajectory of the ant is unbounded. 
This unboundedness also holds on the triangular grid [5]. On the other hand,
bounded trajectories are known to exist on the hexagonal grid [7]. 

Unfortunately, this result does not tell us anything else about the behavior of 
the ant in the long term. The experiments, however, suggest that the long-term 
behavior of the ant, although unbounded, is unbounded in a highly repetitive 
way. Specifically, the following conjecture has been open for at least ten years: 
“For any initial configuration with finite support, the ant eventually starts
building the periodic highway, in some unobstructed direction”. If this conjecture is
true, then any problem associated with the ant, whose input is an initial con- 
figuration with finite support, turns out to be decidable, since in that case it 
suffices to iterate on the configuration until the highway appears; the question 
may be answered at that point, since the future dynamics is easily predicted. 



The Present Work 

We consider the natural extension of the ant to general planar graphs, where 
the nodes in the graph take the place of the cells in the square grid, and the 
neighbors of a node are the nodes to which it is connected. We generalize the 
rule in the most obvious way: the states at the nodes are still to-left and to-right, 
and the ant changes these states each time it goes through a node. Furthermore, 
the ant turns to the indicated direction at each time step; for this purpose, 
“turning to the left” is defined as leaving the node through the edge which is 
found moving clockwise, starting from the edge which was used by the ant to 
arrive. The square grid becomes a particular case of regular planar graph. 
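
The turning rule can be made concrete with a small Python sketch. It assumes that the planar embedding is given as a rotation system, that is, for each node the list of its neighbors in clockwise order; the names ant_step, rotation and state are hypothetical. Following the text, to-left selects the next edge clockwise from the arrival edge; taking to-right as the next edge counterclockwise is an assumption about the symmetric case.

```python
def ant_step(rotation, state, prev, cur):
    """One move of an ant that has just arrived at `cur` through edge (prev, cur).

    rotation[u] lists the neighbors of u in clockwise order (assumed encoding
    of the planar embedding); state[u] is 'L' (to-left) or 'R' (to-right).
    Switches state[cur] in place and returns the new (prev, cur) pair.
    """
    ring = rotation[cur]
    i = ring.index(prev)
    if state[cur] == 'L':
        j = (i + 1) % len(ring)   # next edge clockwise from the arrival edge
    else:
        j = (i - 1) % len(ring)   # assumed: next edge counterclockwise
    state[cur] = 'R' if state[cur] == 'L' else 'L'
    return cur, ring[j]
```

For a node c of the square grid with neighbors listed clockwise as N, E, S, W, an ant arriving from S leaves towards W when c is to-left, and towards E on its next arrival from S, since the state of c has been switched in between.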

In Section 2 we study the ant on finite graphs. In a restricted family (graphs 
where no edge belongs to more than one simple cycle) the period of the system 
is linearly bounded in the number of nodes, but in the general case, we show a 
family where the periods grow exponentially with the number of nodes.

We consider next the case of the infinite bi-regular graphs Γ(k,d): these
are graphs where all the nodes have d neighbors, and all the faces (the smallest
cycles) have k neighboring faces. They generalize the original system (the square
grid corresponds to Γ(4,4)), and were chosen as an intermediate point between
it and general infinite graphs. They allow us to study the dependence of the
dynamics of the ant on the degree of the graph and the length of the faces.

In Section 3, we show how to calculate boolean circuits with the trajectory
of the ant.¹ The construction is embedded in any infinite bi-regular graph of
degree 3 or 4, and since it is finite, it can be also embedded in finite graphs. In
the particular cases of the square grid, the hexagonal grid and finite graphs, the
construction uses an appropriately bounded amount of space in the configuration,
and the following questions are thus found to be P-hard problems: Given a
finite initial configuration² of Γ(4,4) (Γ(6,3)) and two nodes α, β, does the ant
visit α before β? Given a finite graph, an initial configuration, and two nodes α
and β, does the ant visit α before β?

The construction of circuits has further and important consequences, which 
are presented in Section 3.3. First, the ant can draw the space-time diagram 
of any one-dimensional cellular automaton (for finite configurations). It follows
that the system is universal, since it may simulate a universal Turing machine. 
Finally, there are undecidable problems related to the dynamics of the ant. 

In Section 4 we consider the case of infinite bi-regular graphs which have de- 
gree strictly greater than 4. In spite of the higher connectivity of these graphs, 
the system seems to be less complex on them. The trajectory of the ant is
restricted to a poorly connected subgraph, a fractal tree of faces, and the construction
of circuits of Section 3 cannot be carried over. The restrictions do not depend 
on the exact degree, but only on the lower bound (5), provided that the lengths 
of the faces are constant. The conjecture stated in the previous section is proved 
to be true on these graphs: for any finite initial configuration, the ant falls in a 
periodic motion with drift. 



1.1 Definitions 

A non-directed simple graph (the only kind we will use) is a pair G = (U, E), where
U is the set of nodes, and E is a set of edges of the form {u, v}, u ≠ v ∈ U. A path
D is a list of nodes of the form (u_0, u_1, ..., u_k), such that ∀i {u_i, u_{i+1}} ∈ E. The
length of D is the integer k. A cycle is a path whose extreme nodes coincide. A
path (a cycle) is simple if it does not repeat nodes (other than the extreme nodes,
in the case of cycles). Two cycles are tangent if they have a unique common node.
The distance between two nodes is the length of the shortest path connecting
them (if there is no such path, it is infinite). The diameter of the graph is the
maximum distance between its nodes. A graph is connected if there are paths
connecting any two nodes. A tree is a connected graph which has no cycles. An
isthmus is an edge whose removal disconnects the graph. The neighbors of a node
u are the nodes N(u) = {v ∈ U : {u, v} ∈ E}. The degree of u is |N(u)|. A graph
is k-regular if all the nodes have degree k. A leaf is a node with degree 1.

¹ As far as we know, the method used to calculate boolean circuits presented here
is original, and differs completely from the classical methods introduced by [12] and
[13] for two-dimensional systems.

² In infinite graphs we say that a configuration of the system is finite if all but a finite
number of nodes are in the same state.
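
The notions of distance, diameter and tree defined in this section can be sketched in Python as follows. The adjacency-list encoding (a dictionary mapping each node to the list of its neighbors) and the function names are illustrative assumptions; breadth-first search is the standard way to obtain shortest path lengths in an unweighted graph.

```python
from collections import deque


def distances_from(adj, source):
    """Breadth-first search: shortest path length from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(adj):
    """Maximum distance between two nodes; infinite if the graph is disconnected."""
    best = 0
    for u in adj:
        dist = distances_from(adj, u)
        if len(dist) < len(adj):
            return float('inf')
        best = max(best, max(dist.values()))
    return best


def is_tree(adj):
    """A connected graph is a tree iff it has exactly |U| - 1 edges."""
    edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    return diameter(adj) != float('inf') and edges == len(adj) - 1
```

On the path 0 - 1 - 2 the diameter is 2 and the graph is a tree; a cycle of four nodes also has diameter 2 but is not a tree.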

A graph is planar if it may be embedded in ℝ², the nodes being represented
by points and the edges by simple curves, so that the curves do not intersect. A
graph is locally finite if any sphere in ℝ² contains a finite number of nodes. In a
planar graph, a face is one of the regions of the partition induced by the graph.
The dual graph of G, G', is defined as the graph G' = (U', E'), where U' is the
set of faces of G, and {i', j'} ∈ E' iff i' and j' have a common edge.

We will be interested both in general finite graphs, and in some regular
infinite graphs. The bi-regular graph Γ(k, d) is the locally finite planar d-regular
graph whose dual is k-regular. Γ(k, d) is finite for k = 3 and d < 6, for k ∈ {4, 5}
and d < 4, and for k = 6 and d < 3. Γ(6, 3), Γ(4, 4) and Γ(3, 6) can be embedded
in ℝ² with edges of constant lengths, and correspond to the hexagonal, square
and triangular grids, respectively. The rest of the cases corresponds to the so-called
“hyperbolic graphs”, that can be embedded in the hyperbolic plane.
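
The finite cases listed above agree with the standard criterion for regular tilings: a tiling by k-sided faces with d faces meeting at each node is finite, Euclidean or hyperbolic according to whether (k − 2)(d − 2) is smaller than, equal to, or greater than 4. This criterion is not stated in the paper; the following Python sketch, with the hypothetical function name classify, uses it to enumerate the cases with k, d ≥ 3.

```python
def classify(k, d):
    """Classify the (k, d) tiling by comparing (k - 2)(d - 2) with 4."""
    m = (k - 2) * (d - 2)
    if m < 4:
        return "finite"
    if m == 4:
        return "euclidean"
    return "hyperbolic"


def cases(kind, limit=10):
    """All pairs (k, d) with 3 <= k, d < limit of the given kind."""
    return {(k, d) for k in range(3, limit) for d in range(3, limit)
            if classify(k, d) == kind}
```

Under this criterion the Euclidean cases are exactly (6, 3), (4, 4) and (3, 6), the three grids named above, and every pair with d ≥ 5 and k ≥ 4 is hyperbolic.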

A decision problem is one where the solution, for a given instance, is yes or 
no. It is said to be decidable if there is an algorithm which answers the question 
in a finite time. Decidable problems are classified in complexity classes, which 
describe the amount of work needed to solve them. An important class is P: 
problems where the answer can be found in polynomial time. A problem to which
any problem in P may be reduced (with the reduction satisfying logarithmic
conditions: see [14], p. 160), is called P-hard; if it also belongs to P, it is called
P-complete. Thus, to show that a problem is P-hard, it is enough to reduce a
P-complete problem to it. 

We say that a system is universal if it may simulate a universal Turing 
machine. This notion of universality implies, in particular, the existence of un- 
decidable problems. The complexity and undecidability of problems associated 
to a dynamical system, as well as the existence of some kind of universality in 
it, are ways to measure its complexity. For Complexity Theory, see [14]. 



1.2 Some Basic Facts About the Ant 

We consider a connected, simple, planar, non-directed graph G = (U,E). Planarity
provides an order of the edges incident to a node u, and the rule of Langton’s
ant is naturally extended in the way already explained in the introduction. A
configuration of the system is defined as the assignment of states to the nodes
at a given time, together with the position of the ant.

The first thing to notice is that the rule is invertible (in the finite case, this 
implies that any configuration belongs to a periodic trajectory of the system). 
Moreover, the ant is its own inverse: if the ant turns back at some moment (for 
instance, when it comes to a leaf), the path to be followed afterwards will be 
exactly the reverse of the path it had followed before.

We note also that at nodes with degree 1, the ant reflects (the only edge it
may use to leave is the same it used to arrive). At nodes with degree 2, it will 
go on, since the next-to-the-right and the next-to-the-left edges are the same. 

2 The Ant on Finite Graphs 

First we consider the case of trees. The idea is the following: the ant goes on, 
until it finds a leaf. At that moment, it will turn and undo its path, until it finds 
another leaf. It will oscillate between these two leaves, forever. 

Theorem 1. On a tree with diameter D, periods are bounded by 4D, and the
set of edges visited by the ant forms a simple path.
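
For a concrete check of the oscillation between two leaves, the following Python sketch measures the period of the ant on small trees. It assumes that each node stores its neighbors in clockwise order, that all nodes start in the to-left state, and that to-right means the next edge counterclockwise; the function names are hypothetical. Since the rule is invertible, the initial configuration always recurs on a finite graph, so the loop terminates.

```python
def period(rotation, prev, cur):
    """Number of steps until the initial configuration (states and ant) recurs."""
    state = {u: 'L' for u in rotation}
    start = (dict(state), prev, cur)
    steps = 0
    while True:
        ring = rotation[cur]
        i = ring.index(prev)
        # to-left: next edge clockwise; to-right (assumed): next counterclockwise
        j = (i + 1) % len(ring) if state[cur] == 'L' else (i - 1) % len(ring)
        state[cur] = 'R' if state[cur] == 'L' else 'L'
        prev, cur = cur, ring[j]
        steps += 1
        if (state, prev, cur) == start:
            return steps


def path_graph(n):
    """Path 0 - 1 - ... - (n-1); its diameter is n - 1."""
    return {u: [v for v in (u - 1, u + 1) if 0 <= v < n] for u in range(n)}
```

On the path with n nodes the sketch measures a period of 4(n − 1): the ant needs two round trips between the leaves, because each leaf is switched only once per round trip while every inner node is switched twice. On a star whose center has three leaves, the ant started towards the center oscillates between two of the leaves and never visits the third one.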

We consider next a graph with no string in its cycles, i.e., such that each
edge belongs to at most one simple cycle. Such a graph consists of a collection
of simple cycles, which may be tangent to each other, or may be connected by
paths. Two cases are to be considered: if there are no isthmuses, then the ant
goes exactly twice through all the edges, in each period. If there are isthmuses,
the graph is analyzed as a tree, where each node represents a component without
isthmuses, and the result is a combination of the first case and Theorem 1.

Theorem 2. On a graph without strings, periods are bounded by 20|U|.



Theorem 3. There is a family of planar graphs G_n = (U_n, E_n), with |U_n| = 2n,
such that for each G_n there is a configuration with period greater than 2^n.



Fig. 1. The Period Grows Exponentially in the Size of these Graphs



G_n is shown in Figure 1a; the arrow shows the initial position of the ant, and all
the nodes start in the to-left state. Each visit of the ant to the pair {u_i, v_i} takes
two visits to the pair {u_{i-1}, v_{i-1}}; this makes the period exponential in n, and
thus we see that we can have exponential periods once we drop the condition of
Theorem 2 (absence of strings). That condition may seem too restrictive; nevertheless,
we find the following: if we keep that condition, but drop the condition
of planarity (not even of the graph, but of the representation determined by the
local left-right orientation), we may get exponential periods: the behavior of the
ant over the graph of Figure 1b is analogous to the case in 1a.

3 Circuit Construction

We can impose a path on the ant by putting the appropriate states in the nodes.
If, in addition, we define the states of certain nodes as our logical variables, the 
ant will “read” them and choose its path accordingly. Now we will show how 
to use this to build a logical gate, where the output is “calculated” by the ant. 
The general form of the gate is described in Figure 2a: at the top, we have some 
nodes whose states represent the input. At the bottom, some nodes represent 
the output; at the beginning, all output nodes are in the to-left state, which will 
represent the logical value false. The ant enters the gate at the left, and exits 
at the right. While being in the gate, it visits the input nodes, and visits (and 
switches) the correct output nodes, according to the function which the gate 
represents. The changes are done from inside, thus allowing the output nodes to 
be used as the input for other gates. 




Fig. 2. (a) Sketch of a Gate (b) XOR Function, Built as ¬(i_1 ∧ i_2) ∧ ¬(¬i_1 ∧ ¬i_2)



To compute a boolean circuit we put the input variables in some nodes at 
the top of the configuration (see Figure 2b), and for the consecutive stages of 
evaluation we put consecutive rows of logical gates. The ant goes through every 
row, starting with the upper one. After going through the last row, the state of 
the last output node contains the evaluation of the circuit for the given input. 

To write a boolean circuit it is enough to have the NOT and the AND 
functions. To construct the circuit we also use gates that allow us to duplicate, 
cross and copy variables. All these gates are sketched in Figure 3. The general 
scheme is the following: the path of the ant bifurcates, depending on the input 
states. After (possibly) changing the output states, the paths are joined and the 
ant exits. In the next section, we show how to construct these configurations in 
bi-regular graphs with degree 3 or 4. 
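
That NOT and AND suffice can be checked on the XOR function of Figure 2b, which is built as ¬(i_1 ∧ i_2) ∧ ¬(¬i_1 ∧ ¬i_2). The following Python sketch only verifies this boolean identity; it does not model the ant or the gate configurations, and the function names are illustrative.

```python
from itertools import product


def NOT(a):
    return not a


def AND(a, b):
    return a and b


def XOR(i1, i2):
    """XOR written with NOT and AND only: NOT(i1 AND i2) AND NOT(NOT i1 AND NOT i2)."""
    return AND(NOT(AND(i1, i2)), NOT(AND(NOT(i1), NOT(i2))))


def truth_table(gate, arity):
    """Map every input tuple to the output of the gate."""
    return {bits: gate(*bits) for bits in product([False, True], repeat=arity)}
```

The first factor rules out the input where both variables are true, the second the input where both are false, so the conjunction is true exactly when the two inputs differ.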

3.1 Embedding in Infinite Regular Graphs 

First of all, we need to have paths for the ant to follow. With ant-path we will
refer to a path which may be walked by the ant, provided that it encounters
the appropriate states on it. In a 3-regular graph, any path is an ant-path, but
in the general case this is not true. The next lemma shows that in Γ(k,4), it is
always possible to bring the ant from any location to any other location. We will
see in Section 4 that this simple fact is not true in the Γ(k, d) graphs if d ≥ 5.

Fig. 3. Simplified Schemes of the Gates (NOT, Copy, AND, Duplicate, Cross)

Lemma 1. Let P = v_0, v_1, ..., v_n be a simple path in Γ(k,4). Then there is a
simple ant-path a_0, ..., a_m that begins at v_0 and ends at v_n. It is composed of
edges that share a face with those of P, and it arrives at v_n through an edge that
is to the right or to the left of (v_{n-1}, v_n), or is (v_{n-1}, v_n) itself.

For the schemes of Figure 3, we need to cross and join paths. To do it, we build
Crossings and Junctions, which may be inserted at the places where they are
needed. They are shown in Figure 4. In the Junction, if the ant enters at 1 or
at 2, it exits at 3. In the Crossing, if the ant first enters at 1, it exits at 2. If
afterwards it enters at 3, it exits at 4. But if it enters first at 3, it exits at 5.



Fig. 4. Crossing and Junction configurations (panels: Γ(k,3), Γ(6,3), Γ(k,4), Γ(4,4)).
White Stands for to-left, Black for to-right



Following Figure 3 and using the configurations of Figure 4, and simple paths,
we define configurations that simulate the AND, NOT, Cross, Copy and Duplicate
gates. We can choose the dimensions of these gates and the positions of their
inputs and outputs arbitrarily, and this can be done in an automatic way. A procedure
that takes a boolean circuit and writes the corresponding configuration
in a Γ(k,d) graph can thus be defined.

Figure 5 shows a Duplicate gate for Γ(4,4) and Γ(6,3) (the square and the
hexagonal grids, respectively). The construction of the other gates for Γ(4,4)
may be found in [15]. The fast growth of the configurations in hyperbolic graphs
does not allow us to show them on Euclidean paper.



Fig. 5. Duplicate Gate in the Square and Hexagonal Grids



3.2 Computational Complexity 

The problem (CIRCUIT-VALUE) of determining, given a boolean circuit C and
a truth assignment t, whether C outputs true with input t, is known to be P-complete
([14], p. 168). Now, fix (k,d), with d ∈ {3, 4}. From 3.1, for any pair
(C,t) we can build a configuration in Γ(k,d) representing them, so that the
ant will end the last row having visited or not having visited the output node
of that row, depending on the outcome of C with input t. Thus the problem
(CIRCUIT-VALUE) is being reduced to the problem (P) of knowing, for a finite
initial configuration of Γ(k,d), whether the ant visits a given node α before
another given node β, or not. For Γ(4,4) (the square grid) we show in [15] that
the reduction satisfies the conditions needed to make (P) P-hard; the case of
Γ(6,3) (the hexagonal grid) is analogous. Taking only the part of the graphs
which is being used for the construction of each circuit, we see that the problem
(P') of answering the same question for a given finite graph and a given initial
configuration is also P-hard.
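
To fix ideas about the source problem of the reduction, an instance (C, t) of CIRCUIT-VALUE can be evaluated directly as in the following Python sketch. The encoding of the circuit as a list of gates in topological order, the wire numbering and the function name circuit_value are assumptions made for the example; the sketch says nothing about how the ant configuration itself is built.

```python
def circuit_value(n_inputs, gates, assignment):
    """Evaluate a boolean circuit on a truth assignment.

    gates is a list of ('AND', i, j) or ('NOT', i) tuples in topological
    order. Wires 0 .. n_inputs-1 carry the inputs and gate number g writes
    wire n_inputs + g. The value of the last wire is the output of the circuit.
    """
    wires = list(assignment)
    if len(wires) != n_inputs:
        raise ValueError("assignment length does not match the number of inputs")
    for gate in gates:
        if gate[0] == 'AND':
            wires.append(wires[gate[1]] and wires[gate[2]])
        elif gate[0] == 'NOT':
            wires.append(not wires[gate[1]])
        else:
            raise ValueError(f"unknown gate {gate[0]!r}")
    return wires[-1]


# Example circuit: OR of two inputs, written with NOT and AND (De Morgan).
OR_GATES = [('NOT', 0), ('NOT', 1), ('AND', 2, 3), ('NOT', 4)]
```

Each gate is evaluated once, so the evaluation takes time linear in the size of the circuit; this is the sense in which the problem belongs to P.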

3.3 Universality 

In a cellular automaton (CA), a quiescent state is defined by the following property:
if a cell and all its neighbors are in the quiescent state, the cell remains
in it at the next iteration. Hence, all the dynamics of the system takes place
at the cells in non-quiescent states and their neighbors. An initial configuration 
with a finite support (i.e., a finite number of non-quiescent states) will keep this 
property through the iterations of the CA. 
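
The role of the quiescent state can be illustrated with a short Python sketch of a one-dimensional CA acting on a finite-support configuration. The sketch assumes two states and neighborhood radius 1, stores only the positions of the non-quiescent cells, and uses elementary rule 90 as an example; these choices and the function names are illustrative and are not taken from the paper.

```python
def ca_step(support, rule, quiescent=0):
    """One iteration of a radius-1, two-state CA on a finite-support configuration.

    support: set of positions of the non-quiescent cells.
    rule(left, centre, right) -> new state. It must map three quiescent cells
    to the quiescent state, so only cells within distance 1 of the support
    can become non-quiescent.
    """
    assert rule(quiescent, quiescent, quiescent) == quiescent
    active = 1 - quiescent

    def value(p):
        return active if p in support else quiescent

    candidates = {p + s for p in support for s in (-1, 0, 1)}
    return {p for p in candidates
            if rule(value(p - 1), value(p), value(p + 1)) != quiescent}


def rule90(left, centre, right):
    """Example rule with quiescent state 0: exclusive or of the two neighbors."""
    return left ^ right
```

Starting from the single cell {0}, three iterations of rule 90 give {-1, 1}, {-2, 2} and {-3, -1, 1, 3}: the support stays finite and widens by at most one cell on each side per iteration.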

The transition rule of a CA can be calculated with a multi-output finite
boolean circuit. So, for a given one-dimensional CA with quiescent state, we can
define an initial configuration on the grid consisting of infinitely many copies of
this circuit, arranged in an infinite trapezoidal array with top row of length L, as
shown in Figure 6. Any initial configuration of the CA whose support has width
less than L can be written as the input of the first row, and the ant simulates the
CA. For widths bigger than L, we just put the initial configuration in a lower
row, and let the ant start running from the appropriate node.



Fig. 6. The ant simulates each iteration of the CA in a row of gates, crosses the
repetitions of the outputs (preparing the next input) and goes to the next row.
R stands for the circuit that calculates the rule.



The undecidability of some CA problems is inherited by the ant system. For 
instance, the problem of knowing whether a given (finite) word will ever appear in 
the evolution of a given one-dimensional CA, for a given initial configuration with 
finite support, is reduced to the problem of deciding whether a given finite block 
ever appears in the evolution of the ant, for a given infinite initial configuration 
of the grid. Since any Turing machine, in particular a universal one, can be 
simulated by a one-dimensional CA with quiescent state, the ant is also universal. 



4 Limitations in Highly Connected Graphs 

When the underlying graph has degree strictly greater than 4, the ant cannot 
reach all the nodes of the graph, given a fixed starting position. In the triangular 
grid, for instance, the unique simple path is a zigzagging line. 

For most of the following results, we will consider a generalization of the 
bi-regular graphs: A graph is said to verify (H) if all its nodes have degree ≥ d
and its dual graph is k-regular, with d = 5 and k ≥ 4, or d ≥ 6 and k ≥ 3.

The proof of the following lemma is based on [16], and uses relations between 
the number of nodes, edges and faces enclosed by the cycle. Here we call ant-cycle 
an ant-path that is a cycle. 

Lemma 2. For a graph verifying (H), the unique simple ant-cycles are the faces. 

If one tries to design an ant-path so as to form a cycle different from a 
face, soon it is noticed that the origin, as well as an infinity of other nodes, are 
impossible to reach. This is exactly what Lemma 3 says.

Lemma 3. Let us consider a Γ(k,5) graph and Figure 7a. Then, no simple
ant-path starting with the nodes (a_0, b) may exit the shaded zone.




Fig. 7. The boundary in (a) is composed of the two edges incident to a_0 to the
right and to the left of (a_0, b), and the edges found by recursively adding the
edges adjacent to the last ones, so as to leave two edges inside the zone.

Applying recursively Lemma 3 to each edge of each simple ant-path, we 
obtain that in fact the simple ant-paths are restricted to the sub-graph shown 
in Figure 7b. This graph is a fractal tree of tiles, with degree k - 1. The nodes
have degree 4 or 5 in this sub-graph, except a_0, which has degree 3. Moreover,
this fact is not only true for simple paths: with the help of the following lemma
we can apply it to arbitrary ant-paths.

Lemma 4. Let G be a graph verifying (H). If the ant begins between two nodes
in the same state, then, a node that can be reached by the ant, can be also reached 
through simple ant-paths. 

But the ant is frequently between two nodes in the same state. Indeed, it 
cannot avoid this situation for more than k steps, for k consecutive equal states 
would bring it back to the first node. The ant will therefore be always restricted 
to a subgraph like the one described above. Since this sub-graph is defined in- 
dependently of the degree of the graph (only the lower bound is required), the 
result requires only (H), and we obtain the next theorem. 

Theorem 4. Let G be a graph verifying (H). Then, each time the ant is between 
two nodes in the same state, its future trajectory is restricted to the sub-graph 
depicted in Figure 7b. 

The following theorem shows that any problem related to the behavior of 
the ant over a finite configuration is decidable. In general, there are many cases 
where the trajectory of the ant turns out to be easily predicted, due to the 
restricted behavior, and in particular, the trajectory of the ant is found to be 
unbounded for any initial configuration.

Theorem 5. Let G be a graph verifying (H). Starting over a finite initial
configuration, the ant always falls in a periodic motion with drift. The period of this
eventual behavior is (k - 1)(k + 1).

5 Conclusions 

We studied the generalization of Langton’s ant to different planar graphs, with 
special emphasis on the point of view of complexity. Several constructions and
formal results were obtained, and can be useful in future studies. 

In the general case, a high degree of unpredictability was seen. A hint for
this is the existence of families of finite graphs where the period of the system may
grow exponentially with the size of the graph. A further hint is the existence of
P-hard problems included in the prediction of the dynamics of the system, for
the family of the finite graphs, and for the square and hexagonal grids.

Infinite bi-regular graphs were studied, dividing them into two cases: first, the
graphs with degree 3 or 4, and second, the graphs with degree greater than or equal
to 5. A natural reason for this division is found in Lemma 1 and Theorem 4:
in the graphs of the second case, the ant cannot go from any location to any 
other location, whereas in the first case this is always possible. This difference 
seems to have deep implications, for the results obtained in the different cases, 
even if not directly contradictory, point towards different levels of complexity. 

In the first case (low degrees), which includes the classical square grid, a 
method for the evaluation of boolean circuits was found. This was used to show 
the universality of the system, and to show the existence of undecidable problems 
related to the trajectory of the ant. 

In the second case, there are strong restrictions on the trajectory of the ant,
which can only walk on a tree of tiles. This forbids the construction of circuits
like the ones in the first case. Moreover, its behavior is decidable for initial
configurations with finite support.

Abstract. In this paper we study the k-station placement problem (k-SP
problem, in short) on graphs. This problem has application to efficient
multicasting in circuit-switched networks and to space efficient traversals. 
We show that the problem is NP-complete even for 3-stage graphs and 
give an approximation algorithm with logarithmic approximation ratio. 
Moreover we show that the problem can be solved in polynomial time 
for trees. 

Keywords: Multicasting, approximation algorithms, distributed systems,
networks.



1 Introduction 

In this paper we introduce and study the k-station placement problem on graphs.
Consider a communication network modeled by a weighted directed graph G =
(V, E) where the length ℓ(e) of an edge e is the cost of sending a message along
that edge. Any vertex u of G that needs to communicate with another vertex
v does so by first establishing a virtual circuit p from u to v along a path
connecting u and v and then by sending the message. The cost of establishing a
virtual circuit p is the sum of the lengths of the edges of p. Suppose now that we
are given a distinguished source vertex s and a set D of destination vertices and
that s needs to multicast a message to the vertices of D. One possible way of
performing this task would be for s to establish a virtual circuit with each vertex
v of D along the shortest path between s and v. This approach has the advantage
that transmission is achieved in one step but its cost might be very high: one
edge might be used by multiple circuits, with the effect that its cost would be
added with the same multiplicity to the total cost. A different approach would be to
identify a set S_1 of intermediate stations and assign each vertex of D to a vertex
of S_1. The communication takes place in two steps. First, s establishes a virtual
circuit with each of the vertices of S_1 and transfers the message. In the second
phase, each vertex of S_1 establishes circuits with its assigned destinations so that
the message is finally delivered to the destination vertices. In general, one can
have k sets S_1, ..., S_k of intermediate stations. The communication takes place
in k steps: the source vertex establishes a circuit with each of the vertices of S_1
(level 1 stations); vertices of S_i (level i stations), 1 ≤ i < k, establish circuits
with the vertices of S_{i+1} (level i+1 stations) and finally vertices of S_k establish
circuits with the vertices of D.

There is a clear tradeoff between the number of steps needed to complete the 
multicasting (that is the number of intermediate stations along a path from s 
to a destination vertex) and the cost of the transmission: 1-step communication
incurs a high cost; having each vertex as an intermediate destination yields
minimum cost multicasting but it completely wastes the performance offered by
a circuit-switched network.

In this paper we study the problem of allocating intermediate stations in a 
graph so that at most k stations are encountered on a path from the source to a
destination and the cost of multicasting is minimum. 

The k-SP problem also has applications to the problem of traversing an
ordered binary tree T. Each vertex of the tree has pointers to the left and right
child only and we are provided with a pointer to the root of the tree. The inorder
traversal of the tree reaches all the leaves of the tree in time O(n) (here n is
the number of vertices of T). However, in the worst case, it needs Ω(n) registers
to store the addresses of the vertices of T for which the traversal has not been
completed yet. This is hidden by the recursive approach often used to present
the inorder traversal. Alternatively, one might consider the following approach.
Each leaf v of a binary tree of depth h is uniquely identified by the binary string
Path(v) of length at most h that describes the path from the root to v (0 stands
for a link to a left child and 1 for a link to a right child). Thus, we have the
following very simple algorithm: for each leaf v, start from the root of T and
follow the path specified by Path(v). As is easily seen, this algorithm does not
need any additional register to store addresses of vertices of the tree but, on the
other hand, its running time is proportional to the path length of the tree, which
can be quadratic in n. In general, one might ask to perform the fastest traversal
of the tree given that only k registers are available. As will be clear in the sequel,
this problem is closely related to the k-SP problem.
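
The restart-from-the-root traversal described above can be sketched in Python as follows. The `Node` class and the function names are hypothetical, and the helper `leaf_paths` uses an explicit stack only to produce the strings Path(v) for the demonstration; the traversal itself, `visit_by_restarting`, keeps a single pointer.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def leaf_paths(root: Node) -> List[str]:
    """Demo helper: list Path(v) for every leaf v, left to right."""
    paths, stack = [], [(root, "")]
    while stack:
        node, path = stack.pop()
        if node.left is None and node.right is None:
            paths.append(path)
        if node.right is not None:
            stack.append((node.right, path + "1"))
        if node.left is not None:
            stack.append((node.left, path + "0"))
    return paths


def visit_by_restarting(root: Node, paths: List[str]) -> Tuple[List[str], int]:
    """Reach each leaf by following its path string from the root.

    Returns the visited leaf labels and the number of pointer steps,
    which equals the total path length of the tree.
    """
    visited, steps = [], 0
    for path in paths:
        node = root  # the only pointer that is kept
        for bit in path:
            node = node.left if bit == "0" else node.right
            steps += 1
        visited.append(node.label)
    return visited, steps


tree = Node("r", Node("a", Node("c"), Node("d")), Node("b"))
print(visit_by_restarting(tree, leaf_paths(tree)))
```

The step count returned by `visit_by_restarting` is the sum of the leaf depths, which is the quantity the text calls the path length of the tree.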

The k-Station Placement Problem. We now formally define the k-Station
Placement (k-SP) problem.



Definition 1 (The k-SP Problem). An instance of the k-SP problem consists
of a directed graph G = (V, E), a length function ℓ defined over the edges of G,
an integer k, a source vertex s and a set of destination vertices D.

A feasible solution to the k-SP problem (or k-placement) consists of k+1 sets
of stations S_1, ..., S_k, S_{k+1}, with S_{k+1} = D, and k assignments φ_1, ..., φ_k, with
φ_i mapping vertices of S_{i+1} into vertices of S_i ∪ {s}. For every i, function φ_i
must satisfy the following property (that we will call strictness): for any v ∈ S_{i+1},
the lightest path from v to φ_i(v) does not contain any vertex in S_{i+1} other than
v.

The cost of a feasible solution P = (S_1, ..., S_k, φ_1, ..., φ_k) is

\[ Cost(P) = \sum_{i=0}^{k} \sum_{v \in S_{i+1}} w(\phi_i(v), v), \]

where φ_0(v) = s for all v ∈ S_1 and w(u, v) is the cost of the shortest path from
u to v according to ℓ.

The task is to compute a feasible solution of minimum cost. 
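
To make the cost function concrete, here is a minimal Python sketch that evaluates the cost of a placement from a table of shortest-path costs. The function name `placement_cost`, the nested-dictionary table `w`, and the toy instance are assumptions for illustration; the sketch does not check the strictness property.

```python
from typing import Dict, Hashable, List

Vertex = Hashable


def placement_cost(
    w: Dict[Vertex, Dict[Vertex, float]],
    s: Vertex,
    levels: List[List[Vertex]],
    phis: List[Dict[Vertex, Vertex]],
) -> float:
    """Cost of a k-placement.

    levels = [S_1, ..., S_k, D] and phis = [phi_1, ..., phi_k], where
    phi_i maps each vertex of S_{i+1} to a vertex of S_i or to s.
    w[u][v] is the shortest-path cost from u to v.
    """
    # Term i = 0: every level-1 station is reached directly from s.
    cost = sum(w[s][v] for v in levels[0])
    # Terms i = 1..k: every vertex of S_{i+1} is reached from phi_i(v).
    for i, phi in enumerate(phis):
        cost += sum(w[phi[v]][v] for v in levels[i + 1])
    return cost


# Toy instance (hypothetical): s -> a costs 1, a -> d1 and a -> d2 cost 1 each.
w = {"s": {"a": 1, "d1": 2, "d2": 2}, "a": {"d1": 1, "d2": 1}}
direct = placement_cost(w, "s", [[], ["d1", "d2"]], [{"d1": "s", "d2": "s"}])
via_a = placement_cost(w, "s", [["a"], ["d1", "d2"]], [{"d1": "a", "d2": "a"}])
print(direct, via_a)
```

In this toy instance the shared edge from s to a is paid twice when both destinations are served directly and only once when a is used as a station.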

The strictness property guarantees that the shortest path from the source 
to any destination node contains no more than k stations. If the property does 
not hold, we have a new problem we call the k-Unrestricted Station Placement
problem, or k-USP problem.

Going back to the example of multicasting in a network G, we observe that
the minimum-cost k-hop multicasting is obtained by solving the k-SP problem
on G. The k-placement gives the k sets S_1, ..., S_k of intermediate destinations
and specifies, by means of the φ_i's, the virtual connections each vertex of S_i
has to establish (i.e., v ∈ S_i has to establish a virtual circuit with each vertex
u ∈ S_{i+1} such that φ_i(u) = v). The cost of the k-placement is the sum of the
lengths of the circuits that are established to accomplish multicasting.

Let us now briefly discuss how the k-SP problem can be used to design the
fastest traversal of a tree T using at most k registers to store pointers to vertices.
Suppose we have the solution to the k-SP problem on T with s equal to the root
of T and the set D equal to the set of leaves of T. The traversal proceeds in
the following way. From the root, we reach each of the vertices of S_1. While at
s_1 ∈ S_1, we recursively traverse the tree rooted at s_1 using k - 1 registers (one
register is used to keep a pointer to s_1). It is easy to see that the cost of the
k-placement is equal to the time spent to perform the visit.

Missing proofs and other generalizations of the problems can be found in
the final version of the paper.

Related Problem. The Steiner tree problem, defined as follows, shows some
similarity to the k-SP problem.

Definition 2. Steiner Tree Problem

INSTANCE: a simple graph G = (V, E), a weight function w(e) ∈ N for each
edge e ∈ E, a target subset D ⊆ V of vertices.

TASK: find a minimum weight subtree of G that covers all vertices in D.

It is well known that this problem is NP-Complete ([ND12] in [4]). The k-SP
problem differs from the Steiner tree problem because of different cost functions.
However, in Section 4, we show how an approximate solution to the k-SP can be
derived from an approximate solution to a special variation on the Steiner tree
problem.

Roadmap. In Section 2, we show that the k-SP problem is NP-complete for
any value of k, even if we consider multi-stage directed graphs with all the
edges having the same length. In Section 3.1, we present a polynomial-time
algorithm k-SP-Tree for the special case of trees and in Section 3.3 we present
an algorithm based on dynamic programming for the 1-USP problem on constant
degree trees. In Section 4, we give approximation algorithms for the k-USP
problem.



2 Hardness Result for Multi-stage Graphs 

In this section we first prove that the 1-SP problem is NP-Complete even on
3-stage directed graphs by reducing Set Cover (SC for short) to this problem.
Based on this, we also prove that for any k, the k-SP problem is NP-Complete
on (k + 2)-stage directed graphs. We further show similar results for undirected
multi-stage graphs. The decisional version of the k-SP problem is the following:

Definition 3. (Decisional k-Station Placement (k-DSP))

INSTANCE: (G, s, D, ℓ, B) where G = (V, E) is a simple connected graph, s ∈ V
is the source, D ⊆ V is a set of destinations, ℓ : E → N is a positive function,
representing length of edges, and B is a positive integer.

QUESTION: is there a feasible solution (S_1, ..., S_k, φ_1, ..., φ_k) to the k-SP
problem on G such that Cost(S_1, ..., S_k, φ_1, ..., φ_k) ≤ B?

We now briefly recall the definition of decisional-SC and p-stage graphs and
then we show the reduction of decisional-SC to the 1-DSP problem.

Definition 4. Decisional Set Cover

INSTANCE: (T, C, B) where C is a collection of subsets of a finite set T and B
is an integer.

QUESTION: is there a set cover for T (i.e., a subset C' ⊆ C such that every
element in T belongs to at least one member of C') of cardinality less than or
equal to B?

The problem has been shown to be approximable within 1 + ln |T| in [5] and
not approximable within (1 - ε) ln |T| for any ε > 0 unless NP ⊂ DTIME(n^{log log n})
in [3].

Definition 5 (p-Staged Graphs). A p-stage graph G = (V, E) is a directed
graph whose vertices can be partitioned into p sets V_1, ..., V_p such that for every
edge (u, v) ∈ E, u ∈ V_i and v ∈ V_{i+1} or vice-versa for some i = 1, ..., p - 1. A
weighted p-stage graph is a p-stage graph with weights on edges. A strong p-stage
graph is a p-stage graph with edges directed from V_i to V_{i+1} for i = 1, ..., p - 1.

Theorem 1. 1-DSP on weighted strong 3-stage graphs is NP-Complete even if
all the edges have the same weight.

Proof. Obviously, the 1-DSP problem is in NP. We reduce Decisional-SC to
1-DSP. Suppose I = (T, C, B) is an instance of Decisional-SC in which C =
{C_1, ..., C_n} and, for i = 1, ..., n, C_i ⊆ T = {t_1, ..., t_m}. We construct a
quintuple I' = (G, s, D, ℓ, B') such that if I' belongs to 1-DSP then I belongs to
Decisional-SC, where G = (V, E) is the following strong 3-stage graph:

V = {s} ∪ {C_1, ..., C_n} ∪ {t_1, ..., t_m}

E = {(s, C_i) | i = 1, ..., n} ∪ {(C_i, t_j) | t_j ∈ C_i}

D = T and, w.l.o.g., we assume that for any e ∈ E, ℓ(e) = 1.

Let (S*, φ*) be a 1-placement for G such that Cost(S*, φ*) ≤ B + |D| for
source s and set of destinations D = {t_1, ..., t_m}.

Suppose, at first, that S* ⊆ C; then, by definition, S* is a set cover for T.
Moreover, the cardinality of the set cover is less than or equal to B:

\[ B + |D| \ge Cost(S^*, \phi^*) = \sum_{v \in S^*} w(s, v) + \sum_{d \in D} w(\phi^*(d), d) = \sum_{v \in S^*} 1 + \sum_{d \in D} 1 = |S^*| + |D|. \]

Suppose, now, that S* ⊄ {C_1, ..., C_n}. Then, by the strictness property,
either S* = {s} or S* contains some vertex in {t_1, ..., t_m}. In both cases, we
show how to construct, in polynomial time, a new feasible 1-placement (S, φ) for
G such that S ⊆ C and Cost(S, φ) ≤ Cost(S*, φ*).

1. S* = {s}: Notice that Cost(S*, φ*) = 2|D| because, for every d ∈ D,
w(s, d) = 2. Define a new function φ : D → {C_1, ..., C_n} that associates, to
every d ∈ D, a vertex C_i on one path from s to d. Let S = {φ(d) | d ∈ D} and
consider the new 1-placement (S, φ); by construction

Cost(S, φ) = |S| + |D| ≤ 2|D| = Cost(S*, φ*).

2. There exists a node t such that t ∈ S* ∩ {t_1, ..., t_m}. Also in this case, we
construct a new feasible solution by substituting t in S* with one of its parents
in C. As t has no outgoing edges, there is no destination vertex d ∈ D, different
from t, such that φ*(d) = t and thus the cost of this new placement is less than or
equal to the cost of (S*, φ*). □
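
The construction used in the proof can be sketched in Python as follows: the sketch builds the unit-length strong 3-stage graph from a Set Cover instance and evaluates the 1-placement induced by a cover. The vertex encoding with tagged tuples and the names `build_instance` and `placement_from_cover` are assumptions for illustration only.

```python
from typing import Dict, Hashable, List, Sequence, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def build_instance(T: Sequence[Hashable], C: Sequence[Set[Hashable]]) -> Dict[Edge, int]:
    """Unit-length edges (s, C_i) and (C_i, t_j) for every t_j in C_i."""
    edges: Dict[Edge, int] = {}
    for i, subset in enumerate(C):
        edges[("s", ("C", i))] = 1
        for x in subset:
            edges[(("C", i), ("t", x))] = 1
    return edges


def placement_from_cover(
    T: Sequence[Hashable], C: Sequence[Set[Hashable]], cover: Sequence[int]
) -> Tuple[List[int], int]:
    """Stations and cost of the 1-placement induced by a set cover.

    cover lists the indices of the chosen subsets; every destination is
    assigned to the first chosen subset that contains it.
    """
    edges = build_instance(T, C)
    phi = {}
    for x in T:
        owners = [i for i in cover if x in C[i]]
        if not owners:
            raise ValueError(f"{x!r} is not covered")
        phi[x] = owners[0]
    stations = sorted(set(phi.values()))
    cost = sum(edges[("s", ("C", i))] for i in stations)             # |S| terms
    cost += sum(edges[(("C", i), ("t", x))] for x, i in phi.items())  # |D| terms
    return stations, cost


T = [1, 2, 3, 4]
C = [{1, 2}, {2, 3}, {3, 4}]
print(placement_from_cover(T, C, [0, 2]))
```

For a cover of cardinality B in which every chosen subset serves at least one element, the returned cost is B + |D|, matching the bound used in the proof.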

We can prove that the problem is NP-complete also for undirected graphs.

Theorem 2. 1-DSP on undirected weighted 3-stage graphs is NP-Complete.

Moreover, by the known non-approximability results [3] of Set Cover we have
the following corollary.

Corollary 1. 1-DSP on weighted 3-stage graphs is not approximable within
(1 - ε) ln |D| for any ε > 0, unless NP ⊂ DTIME(n^{log log n}), even if all edges have
the same weight.

Lemma 1. For all i ≥ 1, i-DSP reduces to (i + 1)-DSP.

Proof. Let I = (G = (V, E), s ∈ V, D ⊆ V, ℓ : E → N, B ∈ N) be an instance
of the i-DSP problem. Construct graph G' = (V ∪ {z_1, z_2}, E ∪ {(z_1, s), (s, z_2)})
and let ℓ' : E' → N be the natural extension of ℓ to G' such that ℓ'(z_1, s) =
|V| Σ_{v,u ∈ V} w(v, u) and ℓ'(s, z_2) > 0. Consider the following instance of the
(i + 1)-DSP problem: I' = (G', z_1, D ∪ {z_2}, ℓ', B + ℓ'(z_1, s) + ℓ'(s, z_2)).

We now show how to derive a feasible solution P = (S_1, S_2, ..., S_i, φ_1, ..., φ_i)
to instance I given a feasible solution P' = (S'_1, S'_2, ..., S'_{i+1}, φ'_1, ..., φ'_{i+1}) to
instance I'. By the strictness property we deduce that if s ∈ S'_1 then S'_1 = {s}
and, thus, given P' only the following cases can arise:

1. S'_1 = {s}: if there is only one station at the first level and this station is
exactly s, restricting P' to graph G, we obtain a feasible solution P for I.

2. S'_1 ≠ {s}: we can construct a new feasible solution P'' = (S''_1, S'_2, ..., S'_{i+1},
φ''_1, φ'_2, ..., φ'_{i+1}) to I', such that S''_1 = {s} and, for every v ∈ S'_2, φ''_1(v) = s. P''
is a feasible solution to I', and Cost(P') ≥ Cost(P''):

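\[
\begin{aligned}
Cost(P') &= w(z_1, z_2) + \sum_{v \in S'_1} w(z_1, v) + \sum_{i=1}^{k} \sum_{v \in S'_{i+1}} w(\phi'_i(v), v) \\
&= \ell'(z_1, s) + \ell'(s, z_2) + |S'_1|\,\ell'(z_1, s) + \sum_{v \in S'_1} w(s, v) + \sum_{i=1}^{k} \sum_{v \in S'_{i+1}} w(\phi'_i(v), v) \\
&\ge \ell'(z_1, s) + \ell'(s, z_2) + |S'_1|\,\ell'(z_1, s) + \sum_{v \in S'_2} w(\phi'_1(v), v) + \sum_{i=2}^{k} \sum_{v \in S'_{i+1}} w(\phi'_i(v), v) \qquad (1) \\
&\ge \ell'(z_1, s) + \ell'(s, z_2) + \sum_{v \in S'_2} w(s, v) + \sum_{i=2}^{k} \sum_{v \in S'_{i+1}} w(\phi'_i(v), v) = Cost(P'') \qquad (2)
\end{aligned}
\]

where, to go from (1) to (2), we used the fact that ℓ'(z_1, s) ≥
Σ_{v ∈ S'_2} w(s, φ'_1(v)). Restricting P'' to graph G, we have a feasible solution P for I
and

\[ B + \ell'(z_1, s) + \ell'(s, z_2) \ge Cost(P'') = \ell'(z_1, s) + \ell'(s, z_2) + Cost(P). \]

□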

From Theorem 1 and Lemma 1 we obtain:

Corollary 2. k-DSP on weighted strong (k + 2)-stage graphs is NP-Complete
even if all edges have the same weight.

From Theorem 2 and Lemma 1 we obtain:

Corollary 3. The k-DSP problem is NP-Complete on (k + 2)-stage graphs.

Finally, by Lemma 1 and Corollary 1, we have the following non-approximability
result:

Theorem 3. The k-DSP problem is not approximable within (1 - ε) ln |D| for
any ε > 0 unless NP ⊂ DTIME(n^{log log n}).

3 Optimal Placement on Trees 

In this section we present polynomial-time algorithms for the k-SP problem and
the k-USP on directed trees. We first present an algorithm for the 1-SP problem
and, then, extend it to the general case k > 1. We then show a simple
dynamic-programming algorithm for the 1-USP problem.

W.l.o.g., we can assume that the set D of destinations is the set of leaves in
the tree T. Indeed, starting from T we can remove the leaves that are not in D
and for any internal vertex v ∈ D, we can add a new leaf d_v, that becomes a
new destination, and a new edge (v, d_v) with cost zero, obtaining a new tree T'.
It is easy to see that solving the problem on T is equivalent to solving the problem
on T'.
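
The normalization step above can be sketched in Python as follows. The child-list representation, the name `normalize_tree`, and the tagged tuple used for the new leaf d_v are assumptions for illustration; the sketch also assumes that non-destination leaves are removed repeatedly, until every remaining leaf is a destination.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Vertex = Hashable


def normalize_tree(
    children: Dict[Vertex, List[Vertex]],
    length: Dict[Tuple[Vertex, Vertex], float],
    root: Vertex,
    D: Iterable[Vertex],
):
    """Return (children', length', D') where D' is exactly the set of leaves."""
    D = set(D)
    new_children: Dict[Vertex, List[Vertex]] = {}
    new_length: Dict[Tuple[Vertex, Vertex], float] = {}

    def keep(v: Vertex) -> bool:
        # Keep v if it is a destination, the root, or has a kept descendant.
        kept = [c for c in children.get(v, []) if keep(c)]
        if not kept and v not in D and v != root:
            return False
        new_children[v] = kept
        for c in kept:
            new_length[(v, c)] = length[(v, c)]
        return True

    keep(root)
    new_D: Set[Vertex] = set()
    for d in D:
        if new_children.get(d):      # internal destination: add a zero-cost leaf
            leaf = ("leaf", d)
            new_children[d].append(leaf)
            new_children[leaf] = []
            new_length[(d, leaf)] = 0
            new_D.add(leaf)
        else:
            new_D.add(d)
    return new_children, new_length, new_D


children = {"s": ["a", "b"], "a": ["d1", "x"], "b": ["y"]}
length = {("s", "a"): 1, ("s", "b"): 1, ("a", "d1"): 1, ("a", "x"): 1, ("b", "y"): 1}
print(normalize_tree(children, length, "s", {"a", "d1"}))
```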

In the following, for any vertex v, we denote by T(v) the subtree rooted at
v, by L(v) the set of leaves in T(v) and by p(v) the parent of v in T.

3.1 Optimal 1-Station Placement on Directed Trees 

We present algorithm 1-SP-Tree for the 1-SP problem on an n-vertex tree T
with source s and set of destinations D consisting of all the leaves of T.

The algorithm associates, in O(n) time, a cost c(u, v) to every edge (u, v) of
the tree in the following way:

\[ c(u, v) = w(s, v) + \sum_{d \in L(v)} w(v, d). \tag{3} \]

Referring to the k-hop multicasting example, we can think that c(u, v)
corresponds to the cost of multicasting to the vertices of D ∩ T(v) by placing one
station at node v: the first term is the cost of sending the message from the
source to v and the second is the cost of sending messages from v to the vertices
of D ∩ T(v) without using any other intermediate station.

Next, algorithm 1-SP-Tree constructs a graph G by adding a new vertex
t to T and by connecting the vertices of D to t using infinite-cost edges. Then
the algorithm computes a minimum cut of G with respect to source s and sink
t. Let C be the computed cut; the algorithm outputs placement (S_1, φ_1) such
that: S_1 consists of all the vertices v such that the edge (p(v), v) belongs to C;
function φ_1 assigns to each vertex u ∈ D its closest ancestor that belongs to S_1.

Theorem 4. Algorithm 1-SP-Tree, on input an n-vertex tree T, outputs an
optimal solution to the 1-SP problem on T in time O(M(n)), where M(n) is the
running time of the fastest Min-Cut algorithm on n-vertex graphs.

Proof. The analysis of the running time is obvious. Correctness follows from two
observations: first, by construction, placement (S_1, φ_1) output by 1-SP-Tree
is a feasible solution. Second, for every 1-placement P = (S, φ) on T, we can
derive a legal cut C for G such that Cost(P) = Cost(C): cut C is simply
composed of the edges (p(v), v) for every v ∈ S. The proof now simply follows
by contradiction. □
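
As an illustration of 1-SP-Tree, the following Python sketch computes the edge costs of Equation (3) and the minimum s-t cut. It deliberately replaces the general Min-Cut routine by a bottom-up recursion, which suffices here because every s-t path consists of one root-to-leaf path followed by an infinite-cost edge: the cheapest cut below a vertex v is either the edge (p(v), v) itself or the union of the cheapest cuts below its children. The child-list representation and the name `one_sp_tree` are assumptions for this example.

```python
import math
from typing import Dict, Hashable, List, Tuple

Vertex = Hashable


def one_sp_tree(
    children: Dict[Vertex, List[Vertex]],
    length: Dict[Tuple[Vertex, Vertex], float],
    root: Vertex,
) -> Tuple[float, List[Vertex]]:
    """Return (optimal cost, stations) for the 1-SP problem on a rooted tree."""
    depth = {root: 0}                      # depth[v] = w(s, v)
    order = [root]
    for v in order:                        # breadth-first order, parents first
        for c in children.get(v, []):
            depth[c] = depth[v] + length[(v, c)]
            order.append(c)

    leaf_cnt, leaf_sum, best, cut_here = {}, {}, {}, {}
    for v in reversed(order):              # children before parents
        kids = children.get(v, [])
        if not kids:
            leaf_cnt[v], leaf_sum[v] = 1, 0
        else:
            leaf_cnt[v] = sum(leaf_cnt[c] for c in kids)
            # leaf_sum[v] = sum of w(v, d) over the leaves d of T(v)
            leaf_sum[v] = sum(leaf_sum[c] + leaf_cnt[c] * length[(v, c)] for c in kids)
        if v == root:
            continue
        edge_cost = depth[v] + leaf_sum[v]                       # c(p(v), v), Eq. (3)
        below = sum(best[c] for c in kids) if kids else math.inf
        best[v] = min(edge_cost, below)
        cut_here[v] = edge_cost <= below

    stations, stack = [], list(children.get(root, []))
    while stack:                           # collect the heads of the cut edges
        v = stack.pop()
        if cut_here[v]:
            stations.append(v)
        else:
            stack.extend(children[v])
    cost = sum(best[c] for c in children.get(root, []))
    return cost, sorted(stations)


children = {"s": ["a", "d4"], "a": ["d1", "d2", "d3"]}
length = {("s", "a"): 5, ("a", "d1"): 1, ("a", "d2"): 1, ("a", "d3"): 1, ("s", "d4"): 2}
print(one_sp_tree(children, length, "s"))
```

A cut edge whose head is a leaf corresponds to serving that destination directly from the source, since the second term of Equation (3) is then zero.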



3.2 Optimal k-Station Placement on Trees

In the following, we say that, given a k-placement P = (S_1, ..., S_k, φ_1, ..., φ_k),
the subplacement of P with respect to a vertex t ∈ S_1, denoted by P|_t, is the
restriction of P to T(t). We start with the following lemma.

Lemma 2. Let P = (S_1, ..., S_k, φ_1, ..., φ_k) be an optimal k-placement for a
tree T; then for every station t ∈ S_1 the subplacement P|_t is an optimal
(k - 1)-placement for the subtree rooted in t.

Proof. By way of contradiction, assume that P is an optimal k-placement for
a tree T and that there exists one vertex t ∈ S_1 for which P|_t is not an optimal
(k - 1)-placement and, thus, there exists a new placement P'|_t with lower cost.
We can, thus, construct a new placement P' for T, substituting P'|_t for P|_t, such
that Cost(P') < Cost(P), contradicting the hypothesis. □

Algorithm k-SP-Tree works in k phases: the first phase computes an optimal
solution for the 1-SP problem for T(v), for each vertex v; phase j > 1 computes,
for every vertex v, an optimal j-placement for T(v) using the optimal
(j - 1)-placements computed at the previous phase. In detail:

Phase 1: For every v in T compute an optimal 1-placement P_1(v) for T(v) and
its corresponding cost. This is done by running algorithm 1-SP-Tree.

Phase 1 < j < k: For every node v in T compute an optimal j-placement P_j(v)
for T(v) by defining costs for every edge (x, y) in T(v) in the following
way: c(x, y) = w(v, y) + Cost(P_{j-1}(y)). Then compute a Min-Cut on this
subtree (notice that Cost(P_{j-1}(y)) has been computed for every y during
the previous phase).

Phase k: Compute an optimal k-placement for T defining new costs for every
edge (u, v) in T in the following way: c(u, v) = w(s, v) + Cost(P_{k-1}(v)).
Then compute a Min-Cut on T.

The correctness of algorithm k-SP-Tree follows directly from Lemma 2.
Moreover, observe that k-SP-Tree has to solve O(k · n) min-cut problems on
n-vertex graphs and, thus, its running time is O(k · n · M(n)).
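
The phase structure can be sketched in Python as follows. The sketch only computes optimal costs, treats direct service from the subtree root as "phase 0", and again replaces the general Min-Cut routine by a bottom-up recursion on the tree; these choices, the child-list representation, and the name `k_sp_tree_cost` are assumptions for illustration rather than the implementation analyzed in the text.

```python
import math
from typing import Dict, Hashable, List, Tuple

Vertex = Hashable


def k_sp_tree_cost(
    children: Dict[Vertex, List[Vertex]],
    length: Dict[Tuple[Vertex, Vertex], float],
    root: Vertex,
    k: int,
) -> float:
    """Cost of an optimal k-placement on a rooted tree whose leaves are D."""
    order = [root]
    for v in order:                        # breadth-first order, parents first
        order.extend(children.get(v, []))

    def subtree_dists(v: Vertex) -> Dict[Vertex, float]:
        dist, stack = {v: 0}, [v]
        while stack:
            x = stack.pop()
            for y in children.get(x, []):
                dist[y] = dist[x] + length[(x, y)]
                stack.append(y)
        return dist

    dists = {v: subtree_dists(v) for v in order}   # dists[v][y] = w(v, y), y in T(v)
    # Phase 0: every leaf of T(v) is served directly from v.
    cost = {v: sum(d for y, d in dists[v].items() if not children.get(y)) for v in order}

    for _ in range(k):
        new_cost = {}
        for v in order:
            best = {}
            for y in reversed(order):      # children before parents
                if y == v or y not in dists[v]:
                    continue
                kids = children.get(y, [])
                here = dists[v][y] + cost[y]               # c(x, y) of the phase
                below = sum(best[c] for c in kids) if kids else math.inf
                best[y] = min(here, below)
            new_cost[v] = sum(best[c] for c in children.get(v, []))
        cost = new_cost
    return cost[root]


children = {"s": ["a"], "a": ["b", "c"], "b": ["d1", "d2"], "c": ["d3", "d4"]}
length = {("s", "a"): 10, ("a", "b"): 5, ("a", "c"): 5,
          ("b", "d1"): 1, ("b", "d2"): 1, ("c", "d3"): 1, ("c", "d4"): 1}
print([k_sp_tree_cost(children, length, "s", k) for k in range(3)])
```

On this toy tree the second level of stations pays off: with two levels the long edge (s, a) and the edges (a, b), (a, c) are each paid once.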

Theorem 5. Algorithm k-SP-Tree, on input a weighted tree with n vertices,
a distinguished vertex s, a set of destinations D and an integer k, outputs a
k-placement of minimum cost in time O(k · n · M(n)).

3.3 Optimal 1-Unrestricted Station Placement on Trees

We will now present algorithm 1-USP-Tree, based on dynamic programming, to
solve the 1-USP problem on a binary directed tree T where the source s is the
root of T. For the sake of presentation, we assume the tree to be binary, but the
results can be easily extended to constant degree trees. In the following we will
say that a destination d is served by a station v if φ(d) = v.

Notice, first, that an optimal 1-placement has an optimal substructure; in 
fact, with a proof analogous to the one of Lemma 2, we can prove that: 



Lemma 3. Given a tree T rooted in s and an optimal 1-placement OPT for
T, let u_1 and u_2 be the children of s. Then, OPT|_{u_1} and OPT|_{u_2} are optimal
1-placements for T(u_1) ∪ {s} and T(u_2) ∪ {s}, respectively, where the source is s
for both placements and L(u_1) and L(u_2) are the destination sets, respectively.



The proof is analogous to the one of Lemma 2. 

We now define the value of an optimal solution recursively in terms of the
optimal solutions to subproblems. A subproblem is defined as determining the
cost of a placement in a subtree rooted in v placing no more than k stations and
serving all the leaves of T(v), except a given number λ.

Given tree T, let C(v, k, λ) be the minimum cost of serving destinations in
L(v), using at most k stations and knowing that there are exactly λ leaves of
T(v) that are served by some station placed in the upgoing path from v to s;
i.e., these leaves will not be served by the k stations we will place in T(v). We
define C(v, k, λ) recursively in the following way:

If v is a leaf, then C(v, k, 1) = 0, while for λ ≠ 1, we have C(v, k, λ) = +∞.

If v is not a leaf, let u_1 and u_2 be its children. C(v, k, λ) is calculated choosing
the cheapest solution between placing or not placing a station in v and looking
for the optimal solutions for T(u_1) and T(u_2). This is done according to the
following constraints:

1. if we place (respectively do not place) a station in v, then we cannot place
more than k - 1 (resp. k) stations in the subtrees;

2. if we place a station in v, then this station will serve f > 0 leaves of T(v),
f_1 ≤ |L(u_1)| of them in L(u_1) and f_2 ≤ |L(u_2)| in L(u_2);

3. if λ > 0, then λ = λ_1 + λ_2 leaves are served by a station ancestor of v such
that 0 ≤ λ_1 ≤ |L(u_1)| are in T(u_1) and 0 ≤ λ_2 ≤ |L(u_2)| are in T(u_2).

Now, let C_Y(v, k, λ) (respectively C_N(v, k, λ)) be the cost of placing (resp.
not placing) a station in v and placing k - 1 (resp. k) stations in the subtrees,
knowing that λ ≥ 0 leaves are served by an ancestor station. Then,

\[ C(v, k, \lambda) = \begin{cases} \min\{C_Y(v, k, \lambda),\, C_N(v, k, \lambda)\} & \text{if } \lambda \le |L(v)| \\ +\infty & \text{otherwise} \end{cases} \]



Both C_Y(v, k, λ) and C_N(v, k, λ) are the sum of two terms: the first counts
how many times edges (v, u_1) and (v, u_2) are traversed; i.e., edge (v, u_i), i = 1, 2,
is traversed once for every station placed in T(u_i), once for every leaf in L(u_i)
served by a station ancestor of v and once for every leaf in L(u_i) served by v.
The second is the recursive call on T(u_1) and T(u_2) with the proper value of the
parameters. In detail:

1. If λ = 0 then C_N(v, k, 0) is equal to:

\[ \min_{0 \le x + y \le k} \{ \ell(v, u_1) \cdot x + \ell(v, u_2) \cdot y + C(u_1, x, 0) + C(u_2, y, 0) \} \]

2. If λ ≠ 0 then C_N(v, k, λ) is equal to:

\[ \min_{\substack{0 \le x + y \le k \\ \lambda_1 + \lambda_2 = \lambda \\ \lambda_i \le |L(u_i)|,\ i = 1, 2}} \{ \ell(v, u_1)(x + \lambda_1) + \ell(v, u_2)(y + \lambda_2) + C(u_1, x, \lambda_1) + C(u_2, y, \lambda_2) \} \]



In fact, if we do not place a station in v and no leaf is served by an
ancestor station, we simply look for the best way to distribute up to k stations
in the subtrees. If some leaves are served by an ancestor station, we have to find
the cheapest way to distribute these too.

Before giving the definition of C_Y(v, k, λ), we need the following lemma:

Lemma 4. Given any optimal solution OPT = (S, φ) to the 1-USP problem
on binary tree T and a leaf l, there does not exist a vertex v ∈ S, different from
φ(l), that belongs to the unique path going from φ(l) to l.

It is important to notice that the previous lemma does not state that two
stations cannot be on the same root-leaf path, but only that, in the optimal
solution, leaves are served by the closest station.

As a consequence of Lemma 4, we define C_Y(v, k, λ) ≠ +∞ only when
λ = 0. Thus, if λ = 0, C_Y(v, k, 0) is equal to

\[ \min_{\substack{0 \le x + y \le k - 1 \\ f_1 + f_2 > 0 \\ 0 \le f_i \le |L(u_i)|,\ i = 1, 2}} \{ \ell(v, u_1)(x + f_1) + \ell(v, u_2)(y + f_2) + C(u_1, x, f_1) + C(u_2, y, f_2) \} \]



In fact, if we place a station in v we only have up to k - 1 stations to place
in the subtrees and we have to find the most convenient set of leaves of T(v) to be
served by v.

Finally, the cost of an optimal solution OPT for tree T rooted in s is calcu- 
lated in the following way: 



Cost(OPT) = C(s, n, 0). 

To recover OPT, it is sufficient to remember the vertices in which we placed
the stations and this gives us set S. Function φ is easily determined using
Lemma 4: given leaf l, φ(l) is the first vertex of S we find on the upgoing path
from l to the root of the tree.

Theorem 6. Algorithm 1-USP-Tree on binary trees runs in 0{n^). 

Proof. For every vertex v in T and each of the O(n^2) pairs (x, y) such that
x + y ≤ n we have to consider the O(n^2) pairs (λ_1, λ_2) such that λ_1 + λ_2 ≤ n
and the O(n^2) pairs (f_1, f_2) such that f_1 + f_2 > 0 and f_1, f_2 ≤ n. □

4 Approximation Algorithm on General Graphs

In Section 2 we have shown that the k-SP problem is NP-Complete. This implies
the following:

Corollary 4. The k-DUSP problem is NP-Complete for every k.

In this section we show an approximation algorithm for this problem on
general graphs. The key idea of the algorithm is to reduce the k-USP to the
problem of computing a Steiner tree with bounded depth on a graph. Let K_{|V|}
be the complete graph over |V| vertices in which the weight of edge (x, y) is the
cost of the shortest path from x to y in the graph G. The cost of the (k + 1)-depth
Steiner minimum tree of K_{|V|} is equal to the cost of the optimal solution of the
k-USP problem on G. For the sake of presentation, we will present only the case
k = 1, but similar arguments can be used for the case k > 1.

Lemma 5. Let T be a Steiner tree rooted at s for graph K_{|V|} with depth at most
2 and target D. There exists a 1-placement P for the graph G with source s on
destination set D such that Cost(P) = Cost(T).

Proof. The placement P is constructed as follows: the set of stations consists of
the vertices at level 1 in the tree T. For each vertex v ∈ D, φ_1(v) is the parent
of v in the tree T. □

Using a dual argument it is possible to prove the following: 

Lemma 6. Let P be a 1-placement for the graph G on destination set D and 
source s. There exists a Steiner tree T rooted at s with maximum height 2 on 
the clique K_{|V|} with target D such that Cost(P) = Cost(T). 

Lemma 7. Let P* be an optimal 1-placement for a graph G on destination set 
D and let T* be a minimum Steiner tree on the complete graph of shortest paths 
in G. It holds that: 

Cost(P*) = Cost(T*) 

Proof. Assume, by contradiction, that Cost(P*) < Cost(T*). By Lemma 6, it 
is possible to construct a new Steiner tree T' on K_{|V|} with destination set D such 
that Cost(T') = Cost(P*) < Cost(T*). But this contradicts the hypothesis that 
T* was a minimum Steiner tree. A dual argument can be used to prove that if 
Cost(P*) > Cost(T*), then P* is not an optimal placement. □ 

Since the problem of computing a minimum Steiner tree is NP-complete, we 
use an approximation algorithm for this problem in order to obtain an approxi- 
mation algorithm for the k-USP. We recall the following result by Kortsarz and 
Peleg: 

Theorem 7. [6] Let G = (V, E) be a graph and let D ⊆ V be a set of des- 
tinations. There is an approximation algorithm for the minimum Steiner tree 
problem on G with destination set D and maximum depth d with approximation 
ratio O(log |D|), for any constant d, and O(|D|^ε), for any ε > 0, for general d. 
Given the discussion above, the following theorems can be easily proven: 

Theorem 8. For any constant k, there exists an approximation algorithm for the 
k-USP with optimal approximation ratio O(log |D|). 



Corollary 5. For any k, and for any ε > 0, there exists an O(|D|^ε) approxima- 
tion algorithm for the k-USP. 

5 Open Problems 

The immediate open problem left in our work is the design of better approxima- 
tion algorithms for general graphs and d. Also, we do not know of any natural 
class of graphs (other than trees) for which the problem can be solved efficiently. 

From a more combinatorial point of view it would be interesting to ask if 
there exists a function f(·) such that for all trees with n nodes the cost of the 
best f(n)-SP is O(n). It is obvious that any function f(n) = Ω(n) will do (just 
put a station in every vertex). We can show that f(n) = log* n works for complete 
binary trees but could not extend this result to general trees. 
Abstract. We study the problem of learning an unknown function rep- 
resented as an expression over a known finite monoid. As in other areas 
of computational complexity where programs over algebras have been 
used, the goal is to relate the computational complexity of the learning 
problem with the algebraic complexity of the finite monoid. Indeed, our 
results indicate a close connection between both kinds of complexity. We 
focus on monoids which are either groups or aperiodic, and on the learn- 
ing model of exact learning from queries. For a group G, we prove that 
expressions over G are easily learnable if G is nilpotent and impossible to 
learn efficiently (under cryptographic assumptions) if G is nonsolvable. 
We present some partial results for solvable groups, and point out a 
connection between their efficient learnability and the existence of lower 
bounds on their computational power in the program model. For aperi- 
odic monoids, our results seem to indicate that the monoid class known 
as DA captures exactly learnability of expressions by polynomially many 
Evaluation queries. 



1 Introduction 

Formal models of the process of learning have been proposed since the 60s to give 
mathematical foundation to machine learning tasks, mostly to concept learning. 
The first models, such as Gold’s identification in the limit, were of recursion- 
theoretic flavor and did not emphasize efficient use of computational resources. 
In the mid 80s, several models that take time and memory into consideration 
were proposed, thus allowing the use of concepts and tools from computational 
complexity theory in the study of learnability. 

The resulting area, known as Computational Learning Theory, has produced 
an important number of results stating that certain classes of functions are or are 
not efficiently learnable in rigorously defined learning models. But, in general, 
the properties that determine whether a class is easy or hard to learn are not yet 
well identified. We propose an algebraic approach that, in the long run, might 
help in clarifying such properties. 

Programs over finite monoids and other algebraic structures are models of 
computation that have been successfully used to expose the deep reasons why 
some computational problems can or cannot be solved within certain resources. 
In this paper we initiate the study of formal models of function learning from 
an algebraic point of view, i.e., we would like to determine the complexity of 
learning a class of functions from the classes of algebras that are powerful enough 
to compute the class of functions (in the program model or related ones). We 
concentrate on the case where the algebra M consists of an associative operation 
with an identity, i.e., when M is a monoid. 

So far, programs over monoids have been studied mostly as devices to com- 
pute boolean functions, and many circuit complexity classes have been shown 
to admit characterizations as those problems solved by programs over particular 
algebraic structures [4,6,5]. To avoid replicating this study, we look instead at 
programs over M as computing functions from M^n into M. Expressions over M 
are a particular type of programs that appear very naturally in this context. 

We study the problem of learning an unknown target function from M^n to M, 
for a fixed and known finite monoid M. It is assumed only that the function 
is computed by some expression on n variables over M. We work mostly in 
Angluin’s query-based model of exact learning [1,2], where algorithms can ask 
Evaluation queries, or Equivalence queries, or both. 

Note that this problem is not obviously comparable to the problem of learning 
the class of boolean circuits corresponding to programs over monoid M . This is 
because, on the one hand, the problem might be harder because the function class 
is richer. On the other hand, the answers to the queries provide finer information 
than in the boolean setting, namely, elements of the monoid, and this could help 
in learning. 

We present several results on the complexity of learning expressions over 
specific classes of monoids. We concentrate on monoids that are either groups or 
aperiodic, as these two classes are known to be the building blocks of all monoids 
via the so-called wreath product operation. 

Throughout the paper, we often say “a monoid M is learnable” for the sake of 
brevity, meaning “expressions over monoid M are learnable”, and similarly for 
a class of monoids. 

For the case of groups, we prove: 

— Expressions over Abelian groups are polynomial-time learnable both from a 
linear number of Evaluation queries and from a linear number of Equivalence 
queries. 

— Nilpotent groups (a generalization of Abelian groups) are learnable in poly- 
nomial time from Evaluation queries. 

— Solvable groups formed as extensions of a group Z_p by a group Z_q (for any 
p and q) are polynomial-time learnable from Evaluation and Equivalence 
queries in the form of Multiplicity Automata, and learnable in probabilistic 
polynomial time from Evaluation queries alone. 

— A slightly larger subclass of solvable groups can be identified by a probabilis- 
tic strategy using polynomially many Evaluation queries, and by a determin- 
istic strategy using quasipolynomially many Evaluation queries (though no 
claims are made regarding computation time). We show that, in fact, these 
or similar results will hold for any group for which we can prove a certain 
type of lower bound on its computation power; in other words, such lower 
bounds on computation power provide upper bounds on learning complexity. 

— Expressions over nonsolvable groups are not learnable unless NC^1 circuits 
also are, even in the very strong learning model of PAC-prediction with 
Membership queries [3]. Recall that under plausible cryptographic assump- 
tions, NC^1 circuits are not learnable in this or any other standard model of 
learning [3]. 

In the aperiodic case, our results involve the class of aperiodic monoids known 
as DA [15]. For some algorithmic problems on monoids, it is known that feasi- 
bility depends essentially on membership to DA. For example, the membership 
problem is known to be PSPACE-complete for any aperiodic monoid outside of 
DA [7]. Also, the word problem for an aperiodic monoid can be resolved in sub- 
linear communication complexity (in the 2-player setting) iff the monoid belongs 
to DA [14]. Our results are: 

— Expressions over DA monoids are identifiable from polynomially many Eval- 
uation queries (though possibly not in polynomial time). 

— For a subclass of DA, idempotent R-trivial monoids, we can in fact give a 
polynomial-time learning algorithm using Evaluation queries. 

— It is known that there are exactly two minimal aperiodic monoids not belong- 
ing to DA. We show that expressions over any of these two minimal monoids 
are not learnable with subexponentially many Evaluation queries, even with 
arbitrary computation time. We conjecture the same is true for any monoid 
outside of DA, because it is also known that every monoid outside of DA is 
divided by at least one of these two minimal monoids. 

Certainly the picture is still partial, with many upper and lower bounds missing. 
But our results seem to indicate a very close connection between the complexity 
of the monoid (in the algebraic sense) and its learning complexity. 

Finally, let us comment on the relevance of these results for the mainstream of 
computational learning theory. A good deal of the effort in this theory has been 
on learning classes of boolean functions, and especially those inside NC^1, since it 
seems hopeless to try to learn any larger one. As mentioned, many of the central 
complexity classes definable by small-depth circuits can be also defined by pro- 
grams over central classes of monoids. For example, polynomial-length programs 
over aperiodic monoids compute exactly the functions in AC^0, programs over 
solvable groups compute the functions in CC^0 (polynomial-size circuits made of 
mod_q gates for a fixed q), and nonsolvable groups (or monoids) compute all func- 
tions in NC^1 [4,6]. Formally, we do not know how to translate either positive 
or negative results in our model to the boolean case. On the other hand, in 
the learning context, various small fragments of AC^0 and CC^0 are known to be 
learnable, and NC^1 is known not to be learnable in a strong sense, in fact a situation 
quite in analogy with the results in this paper. We believe they are interesting 
as they provide a new perspective on learning problems near the borderline of 
current knowledge. 

Due to lack of space, this extended abstract does not contain the proofs of 
our results. The full version of the paper can be found on the respective home 
pages of the authors, http://www.lsi.upc.es/gavalda and 
http://www.cs.mcgill.ca/denis. 



2 Preliminaries 

2.1 Expressions and Programs over Monoids 

A monoid is a pair (M, •) where M is a set and •, the product over M , is a binary 
associative operation on M with an identity element. A group is a monoid where 
each element has a (two-sided) inverse with respect to •. A monoid is aperiodic 
if it has no submonoid which is a non-trivial group. All monoids considered in 
this paper will be finite. We look at two computation models based on products 
over a monoid: expressions and programs. 

An expression over monoid M and variables x_1, ..., x_n is a string over the 
alphabet M ∪ {x_1, ..., x_n}. Such an expression defines quite naturally a function 
from M^n to M: to evaluate the (function represented by the) expression over a 
vector or assignment (w_1, w_2, ..., w_n) in M^n, replace with w_i each occurrence 
of each variable x_i in the expression, then multiply out the resulting string of 
monoid elements to obtain a single monoid element. For example, assume that 
a, b, c, d are four elements in M. Then the value of expression a x_2 x_1 x_2 b x_3 c x_1 x_3 d 
on the assignment (c, b, a) is the element a·b·c·b·b·a·c·c·a·d, where · is the 
product in M. 
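
As an illustration of this evaluation rule, the following sketch evaluates the expression a x_2 x_1 x_2 b x_3 c x_1 x_3 d over one concrete monoid. The choice of the symmetric group S_3 (permutations stored as tuples), the token encoding, and all function names are assumptions made for the example.

```python
from functools import reduce


def compose(p, q):
    """Product in S_3: apply q first, then p (permutations as tuples of images)."""
    return tuple(p[q[i]] for i in range(len(q)))


IDENTITY = (0, 1, 2)


def evaluate(expression, assignment, product=compose, identity=IDENTITY):
    """Evaluate an expression given as a list of tokens.

    A token ("var", i) stands for variable x_i (1-based); a token
    ("const", m) stands for the monoid element m.
    """
    # Substitute w_i for each occurrence of x_i, then multiply left to right.
    elements = [assignment[t[1] - 1] if t[0] == "var" else t[1] for t in expression]
    return reduce(product, elements, identity)


# Four elements of S_3 playing the roles of a, b, c, d.
a, b, c, d = (1, 0, 2), (0, 2, 1), (1, 2, 0), (2, 0, 1)
x = lambda i: ("var", i)
k = lambda m: ("const", m)

# The expression a x2 x1 x2 b x3 c x1 x3 d on the assignment (c, b, a).
expression = [k(a), x(2), x(1), x(2), k(b), x(3), k(c), x(1), x(3), k(d)]
value = evaluate(expression, (c, b, a))
direct = reduce(compose, [a, b, c, b, b, a, c, c, a, d], IDENTITY)
```

The two results `value` and `direct` coincide, since substituting (c, b, a) turns the expression into the string a·b·c·b·b·a·c·c·a·d.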

Expressions are a particular case of programs. A program over M with domain 
D is a list of instructions of the form (i, f), where i ∈ {1, ..., n} and f is a 
function D → M. Instruction (i, f) is interpreted as follows: read the value of 
variable x_i and append f(x_i) to the string of monoid elements to be multiplied. 
Hence, expressions are programs whose instructions use only constant functions 
and the identity function. 
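
A minimal sketch of the program model, and of the way an expression becomes a program with only constant and identity instructions, is given below. The token encoding of expressions, the function names, and the use of the additive groups Z_6 and Z_2 as example monoids are assumptions for illustration; a constant instruction is made to read variable x_1 arbitrarily, which presumes n ≥ 1.

```python
from functools import reduce


def run_program(program, inputs, product, identity):
    """Run a program: a list of instructions (i, f) with 1-based index i and f: D -> M."""
    return reduce(product, (f(inputs[i - 1]) for i, f in program), identity)


def expression_to_program(expression):
    """Turn an expression (tokens ("var", i) / ("const", m)) into a program over D = M."""
    program = []
    for kind, value in expression:
        if kind == "var":
            program.append((value, lambda x: x))          # identity function
        else:
            program.append((1, lambda x, m=value: m))     # constant function
    return program


add_mod_6 = lambda u, v: (u + v) % 6
expression = [("const", 2), ("var", 1), ("var", 2), ("var", 1)]
program = expression_to_program(expression)
result = run_program(program, (1, 3), add_mod_6, 0)

# A boolean program over Z_2 computing the parity of three boolean inputs.
parity = [(i, lambda bit: int(bit)) for i in (1, 2, 3)]
parity_value = run_program(parity, (True, False, True), lambda u, v: (u + v) % 2, 0)
```

The second program uses instructions that are neither constant nor the identity on M, which is what distinguishes general programs from expressions.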

In the literature, programs have been used mostly to compute boolean func- 
tions [4,6]. In these boolean programs, domain D is {true, false} and M is parti- 
tioned into True and False sets to interpret a boolean result. Expressions could 
also be used to compute boolean functions, say by encoding true and false in- 
puts by distinct elements of the monoid. We will be here mainly interested in 
programs and expressions computing functions from M^n into M, rather than 
boolean functions. 
2.2 Learning Expressions over Monoids: Problem Statement 

In this section we define the learning problem we study. We use mostly Angluin’s 
query-based model of learning [1,2], although Valiant’s PAC model [17] or related 
ones are mentioned occasionally. All definitions are standard for function learning 
in these models, although we give them only in the terms of our specific problem, 
learning expressions over monoids. For background in computational learning 
theory, the reader is referred to the books, surveys, and bibliography in the 
recently created server [13]. 

The task of a learning algorithm (or learner) is to find out a target function 
from M^n to M fixed by a teacher in an adversarial way. The function is assumed to 
be representable as an expression over M and variables x_1, ..., x_n, though not 
all variables are necessarily used. The learning algorithm is initially given n and 
some upper bound m on the length of a shortest expression for the target.^ 
Monoid M is fixed and known both to the teacher and the learning algorithm. 

The learning algorithm must produce an expression (or some other represen- 
tation of the target function, as discussed later) equivalent to the target one on 
all of M^n. 

To achieve learning, teacher and learner exchange information on the target 
function following some protocol; some specific protocols will be defined later. 

Resources used by the algorithm are measured as a function of n and m. The 
class of expressions over M is learnable in time t(n, m) in a given learning proto- 
col if there is an algorithm that learns every expression over M in time t(n, m). 
In particular, we study mostly whether expressions over a class of monoids are 
polynomial-time learnable, meaning whether for each monoid M in the class 
there is a polynomial p(n, m) such that expressions over M are learnable in time 
p(n, m). 

Similarly, we say that expressions over a monoid M are identifiable with in- 
teraction s(n, m) if there is an algorithm that learns every expression over M 
using an amount of interaction s(n, m) with the teacher (and arbitrary compu- 
tation time). The meaning of “amount of interaction” may be different in each 
learning protocol, but in general it has to be bounded by the number of bits of 
information exchanged by the teacher and the learning algorithm. Identifiability 
thus represents the information-theoretic cost of solving a learning task, ignoring 
the computational complexity of the problems that the learning algorithm has 
to solve internally at each stage of the process. 

Let us stress that we assume that monoid M is fixed and known to the 
learning algorithm, and thus we regard |M| as a constant. We often present 
algorithmic schemes to learn whole classes of monoids whose running time is 
exponential (or more) in the size of the monoid. We still call these algorithms 
“polynomial-time” as long as their dependence on n and m is polynomial. A 
stronger notion of “polynomial-time learnability” of a monoid class would ask for 
an algorithmic scheme depending only polynomially on \M\. In an even stricter 

^ Symbols n and m will always have this meaning in the paper, i.e., number of variables 
and an upper bound on the length of a shortest expression for the target. 
sense, we could ask for an algorithm that receives the multiplication table of 
M as part of the input, with the promise that M belongs to the monoid class, 
and is polynomial in |M|, n, and m; this would be truly “uniform” learning 
of the class, in the sense that the algorithm does not rely on hardwired information 
for each monoid. 

As for interaction between teacher and learner, we consider two standard 
query types in Angluin’s model: Evaluation and Equivalence queries. Let the 
target f be a function from set A to set B. In an Evaluation query, the learning 
algorithm produces an element a ∈ A and the teacher must return f(a).^ In an 
Equivalence query, the learning algorithm produces a hypothesis h, representing 
a function A → B in some way; the teacher must return Yes if h = f on A, or 
else a counterexample: an element a ∈ A such that f(a) ≠ h(a), together with 
the value of f(a). If hypotheses issued by the algorithm always belong to the 
same syntactic class of functions that is being learned, the algorithm is called 
proper. Otherwise, hypotheses may belong to a different and possibly richer class, 
and the algorithm is called nonproper. An important requirement on any such 
hypothesis class is that it must be polynomial-time evaluatable, i.e., that a given 
hypothesis can be evaluated on a given input in polynomial time. 
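
The two query types can be pictured with a small teacher object. This is a sketch under stated assumptions: the class name `Teacher`, its method names, and the brute-force scan over an explicitly enumerated finite domain are illustrative choices, and a real teacher may pick counterexamples adversarially rather than returning the first one found.

```python
from itertools import product as cartesian


class Teacher:
    """Answers Evaluation and Equivalence queries for a hidden target f: A -> B."""

    def __init__(self, target, domain):
        self._target = target
        self._domain = list(domain)   # the finite set A, enumerated explicitly

    def evaluation(self, a):
        """Evaluation query: return f(a)."""
        return self._target(a)

    def equivalence(self, hypothesis):
        """Equivalence query: ("Yes", None), or ("No", (a, f(a))) for a counterexample a."""
        for a in self._domain:
            if hypothesis(a) != self._target(a):
                return "No", (a, self._target(a))
        return "Yes", None


# Hypothetical target over M = Z_3 with n = 2 variables: f(x) = 1 + x_1 + 2 x_2 (mod 3).
target = lambda x: (1 + x[0] + 2 * x[1]) % 3
teacher = Teacher(target, cartesian(range(3), repeat=2))

answer = teacher.evaluation((1, 1))
verdict_good = teacher.equivalence(lambda x: (1 + x[0] + 2 * x[1]) % 3)
verdict_bad = teacher.equivalence(lambda x: 0)
```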

Before investigating the learnability of specific classes of monoids, let us 
consider a quite general question. There are several constructions for building 
monoids from other monoids. The most natural ones are direct product, sub- 
monoid, and homomorphic image. It is a natural question whether learnability 
is preserved by these operations. We only have very partial answers so far. 

Proposition 1. 

1. For the three models of query learning (Evaluation queries only, Equivalence 
queries only, or both) the following is true. If expressions over monoids S and 
T are polynomial-time learnable, then expressions over S x T are learnable 
(possibly nonproperly). 

2. If expressions over T are polynomial-time learnable from Equivalence queries 
and S is a submonoid of T, then expressions over S are polynomial-time 
learnable from Equivalence queries. 

We do not know whether learnability under Evaluation queries is preserved 
by taking submonoids. The difficulty is that the algorithm for the larger monoid 
may ask queries on T^n \ S^n; the teacher, knowing only a target function S^n → S, 
is not able to answer these. In fact, the query is ill-posed as the answer may 
be different for different T-expressions defining the same target function over 
S. Under homomorphic image and either type of query, the problem lies in 
inverting the homomorphism on answers to queries in a way that is guaranteed 
to be consistent with some expression over the larger monoid. 

^ Evaluation queries generalize Membership queries [1,2] for functions with non-binary 
range. 
3 Abelian Groups 

Abelian or commutative groups are the simplest from an algebraic point of view. 
Quite naturally, they are easiest from the learning point of view, in the sense 
that they are learnable with a linear number of either Evaluation or Equivalence 
queries. 

Theorem 1. 

a) Expressions over an Abelian group G are learnable with n + 1 Evaluation 
queries. 

b) Expressions over an Abelian group G are learnable with O(n) Equivalence 
queries. 
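
To give some intuition for part (a), here is a sketch for the special case of a cyclic group Z_m written additively. It is not the authors' proof, which is omitted from this extended abstract. It relies on the observation that, by commutativity, every expression over Z_m computes a function of the form c + e_1 x_1 + ... + e_n x_n (mod m), so one query at the all-zero assignment and one query per variable suffice. The function and variable names are hypothetical.

```python
def learn_over_cyclic_group(evaluate, n, modulus):
    """Recover (c, [e_1..e_n]) with target(x) = c + sum_i e_i * x_i (mod modulus).

    Uses exactly n + 1 calls to `evaluate` (Evaluation queries).
    """
    zero = [0] * n
    c = evaluate(tuple(zero))              # all variables set to the identity
    exponents = []
    for i in range(n):
        probe = zero.copy()
        probe[i] = 1                       # a generator of Z_modulus in position i
        exponents.append((evaluate(tuple(probe)) - c) % modulus)
    return c, exponents


def target(x):
    """Hidden expression 2 x1 x3 x1 3 x3 x3 x1 over Z_6, written additively."""
    return (2 + x[0] + x[2] + x[0] + 3 + x[2] + x[2] + x[0]) % 6


queries = []


def oracle(x):
    queries.append(x)                      # count the Evaluation queries asked
    return target(x)


c, exponents = learn_over_cyclic_group(oracle, n=3, modulus=6)
```

The learned pair describes the normal form 5 + 3 x_1 + 3 x_3 of the hidden expression, obtained with n + 1 = 4 queries.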

4 Nilpotent Groups 

Let a and b be elements of a group G and let [a, b] = a b a^{-1} b^{-1} denote the 
commutator of a and b. These are the commutators of weight 2. Commutators 
of weight 3 are [a, [b, c]] and [[a, b], c], and commutators of weight k are defined 
inductively in the obvious way. We say that G is nilpotent of class k iff all 
commutators of weight k + 1 are the identity, and observe that any commutator 
of any weight involving the identity is itself the identity. 
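
The definition can be checked by brute force on small groups. The sketch below represents group elements as permutations (tuples of images) and treats group elements themselves as the commutators of weight 1; both choices, and all function names, are assumptions made for illustration.

```python
def compose(p, q):
    """Product of permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def commutator(a, b):
    """[a, b] = a b a^-1 b^-1."""
    return compose(compose(a, b), compose(inverse(a), inverse(b)))


def generate(generators):
    """Closure of the generators under the product (sufficient in a finite group)."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def commutator_values(group, weight):
    """Set of values taken by all commutators of the given weight."""
    values = {1: set(group)}
    for k in range(2, weight + 1):
        values[k] = {commutator(u, v)
                     for i in range(1, k)
                     for u in values[i] for v in values[k - i]}
    return values[weight]


def is_nilpotent_of_class(group, k):
    identity = tuple(range(len(next(iter(group)))))
    return commutator_values(group, k + 1) == {identity}


D4 = generate([(1, 2, 3, 0), (0, 3, 2, 1)])   # dihedral group of order 8
S3 = generate([(1, 0, 2), (1, 2, 0)])         # symmetric group on three points
```

With these definitions the dihedral group of order 8 is nilpotent of class 2 but not of class 1, while S_3 fails the test for every class tried.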

It is clear that nilpotent groups of class 1 are exactly the Abelian groups. 
And indeed several properties of nilpotent groups are natural generalizations of 
those for Abelian groups. For example, it can be shown that n- variable functions 
that are realizable by programs over nilpotent groups of class k can always be 
represented (in the sense of [5]) by polynomials of degree k (with coefficients 
in an appropriate ring). Expressions over nilpotent groups are learnable from 
Evaluation queries alone. As in the Abelian case, the learning algorithm is based 
on the fact that programs can be rewritten to a normal form, although the 
transformation is more involved in this case. 

Theorem 2. Expressions over a nilpotent class-k group G are learnable with 
|G|*n^ Evaluation queries and ^ time. 

For Equivalence queries, an approach like that in Theorem 1, part (b), seems 
difficult since it would involve solving polynomial equations over cyclic groups. 



5 Solvable Groups 

In this section we will present some partial results on learnability in solvable 
groups. For any two subsets A and B of a group G, denote by [A, B] the subgroup 
generated by all commutators [a, b], with a ∈ A, b ∈ B. We can then form 
the descending chain of subgroups G_0, G_1, ... by setting G_0 = G and G_i = 
[G_{i-1}, G_{i-1}]. The group G is solvable iff this chain reaches the trivial subgroup. 
In order to present our results concerning solvable groups, we have to recall 
the following notion. Let G be a group, and suppose H is a normal subgroup of 
G. Then G is said to be an extension of H by the quotient group G/H and it 
admits the following representation. 

We view elements of G as pairs in H × G/H. For any pairs (h_1, g_1), (h_2, g_2) ∈ 
G, the product in G can be expressed as: 

(h_1, g_1) ·_G (h_2, g_2) = (h_1 ·_H f_{g_1,g_2}(h_2), g_1 ·_{G/H} g_2). 

Functions f_{g_1,g_2} : H → H for each g_1, g_2 ∈ G/H are called the “twist functions” 
for G and describe the interaction between the two components of the group. 

Our results in this section concern solvable groups which are extensions of 
Z_p by Z_q, where Z_p and Z_q are any cyclic groups. Note that, for example, the 
group S_3 of permutations on three points is an extension of Z_3 by Z_2. 
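
The pair representation can be made concrete for S_3 viewed as an extension of Z_3 by Z_2. In the sketch below the twist function negates the Z_3 component whenever the first Z_2 component is 1; this particular twist, and all names, are assumptions chosen for the example. The code also computes the sizes along the descending chain of commutator subgroups to confirm that the group is solvable.

```python
from itertools import product as pairs

P, Q = 3, 2  # S_3 as an extension of Z_3 by Z_2


def twist(g1, g2, h):
    """Twist function f_{g1,g2}: Z_3 -> Z_3; here it negates h when g1 = 1."""
    return h if g1 == 0 else (-h) % P


def mul(x, y):
    """(h1, g1) * (h2, g2) = (h1 + f_{g1,g2}(h2), g1 + g2)."""
    (h1, g1), (h2, g2) = x, y
    return ((h1 + twist(g1, g2, h2)) % P, (g1 + g2) % Q)


G = list(pairs(range(P), range(Q)))
E = (0, 0)


def inv(x):
    return next(y for y in G if mul(x, y) == E)


def generated(elements):
    """Subgroup generated by `elements` (closure under the product)."""
    group, frontier = {E}, [E]
    while frontier:
        g = frontier.pop()
        for s in elements:
            h = mul(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def derived_series(group):
    """Sizes of G_0, G_1, ... until the chain stops shrinking."""
    sizes = [len(group)]
    while True:
        comms = {mul(mul(a, b), mul(inv(a), inv(b))) for a in group for b in group}
        nxt = generated(comms)
        if len(nxt) == len(group):
            return sizes
        sizes.append(len(nxt))
        group = nxt


series = derived_series(set(G))
```

The chain has sizes 6, 3, 1, so it reaches the trivial subgroup after two steps.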

From now on, we view elements of a group G as above as pairs in Z_p × Z_q. 
For an element g ∈ G, we use notations Z_p(g) and Z_q(g) to denote the first and 
second elements of the pair associated to g, that is, we identify g with the pair 
(Z_p(g), Z_q(g)). 

The learning algorithm for these groups uses Multiplicity Automata as hy- 
pothesis class. Multiplicity Automata over rings (MA for short) are an important 
generalization of classical automata. They were first used in the context of learn- 
ing by Bergadano and Varricchio [10], who gave a polynomial-time algorithm for 
learning MA over fields by Equivalence and Evaluation (there called Multiplic- 
ity) queries. Later, Bshouty, Tamon, and Wilson [12] extended the algorithm to 
work over a large class of rings instead of fields, including all finite integer rings. 

The algorithm for MA has been used to learn several other classes of func- 
tions [9,8]. In particular, [9] uses the MA learning algorithm to learn some classes 
of boolean circuits with modular gates and boolean permutation branching pro- 
grams of width at most 4. These results are probably related to the connection 
between some solvable groups and MA that we find here. 

We give here a working definition of Multiplicity Automata; for more sys- 
tematic presentations see [10,12]. An MA over an alphabet Σ and a ring K is a 
nondeterministic finite automaton where each transition triple (q, a, q') (a ∈ Σ) 
is additionally labeled by an element of K. To each path in the automaton we 
associate the value in K given by the product of all the labels along the path. 
The MA computes a function M : Σ* → K in the following way: for each input 
w ∈ Σ*, M(w) is the sum of the values of all nondeterministic paths defined by 
input w on the MA. A particular case of MA that we use here is when Σ = K 
and the MA is evaluated on inputs of a fixed length n, so that M : K^n → K. 
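
Summing the products of labels over all paths amounts to multiplying one transition matrix per input symbol. The sketch below does this over the ring Z_m; the 0/1 vectors marking initial and accepting states, the matrix encoding of transitions, and the name `ma_value` are assumptions made for the example.

```python
def ma_value(word, initial, transitions, final, modulus):
    """Value of a multiplicity automaton over Z_modulus on `word`.

    initial[q], final[q]: weight of starting / accepting in state q
    transitions[a][q][r]: label of the transition (q, a, r), 0 if absent
    """
    weights = list(initial)
    for a in word:
        matrix = transitions[a]
        # weights[r] becomes the summed value of all paths reading the prefix and ending in r.
        weights = [sum(weights[q] * matrix[q][r] for q in range(len(weights))) % modulus
                   for r in range(len(weights))]
    return sum(w * f for w, f in zip(weights, final)) % modulus


# Two states; an accepting path switches from state 0 to state 1 on exactly one 'a',
# so the automaton counts the occurrences of 'a' modulo 5.
transitions = {
    "a": [[1, 1], [0, 1]],
    "b": [[1, 0], [0, 1]],
}
count_a = lambda w: ma_value(w, [1, 0], transitions, [0, 1], 5)
```

For instance, the word `aab` has two accepting paths, each of value 1, so the automaton returns 2.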

Multiplicity Automata are able to simulate expressions over the groups above. 
The proof is based on a somewhat careful study of the structure of their twist 
functions. 

Theorem 3. Let G be a group which is an extension of Z_p by Z_q. Then there 
is a function f : Z_q → Z_p such that for every expression E(x_1, ..., x_n) over 
G there is a multiplicity automaton M over Z_p of size O(n|E|) such that for all 
a_1, ..., a_n ∈ G, 
Z_p(E(a_1, ..., a_n)) = M(Z_p(a_1), ..., Z_p(a_n), f(Z_q(a_1)), ..., f(Z_q(a_n))). 

Furthermore, for all vectors u_1, ..., u_n ∈ Z_p and v_1, ..., v_n ∈ Z_p such that some 
v_i is not in the range of f, 

M(u_1, ..., u_n, v_1, ..., v_n) = 0. 

Then, using the learning algorithm for MA over rings [12], we can prove: 

Theorem 4. For every group G as above, expressions over G are learnable 
in polynomial time using Evaluation and Equivalence queries. The Equivalence 
queries and the output of the algorithm are pairs formed by a Multiplicity Au- 
tomaton over the ring Z_p and an expression over Z_q. 

When p is prime, the groups we have been working with in this section can 
in particular be viewed as special cases of wreath products of Abelian groups by 
p-groups. This larger family has been studied in [5] where it was shown that any 
group of that form could not possibly compute the AND function via a program 
of subexponential length. Moreover, these groups were also shown to have the 
following property. 

Say that a solvable group G is non-narrowing if there is a polynomial p(m) 
such that for all programs P and all a ∈ G, if there is an assignment w ∈ G^n 
such that P(w) = a then there are at least |G|^n / p(length(P)) such assignments. 
This property gives an identification strategy by polynomially many Evaluation 
queries (not necessarily a polynomial-time algorithm, as we don't know how to 
efficiently obtain a hypothesis from the answers). 

The remaining results in this section were obtained through discussions with 
Cris Moore. 

Theorem 5. If G is non-narrowing, then programs (hence, expressions) over G 
can be identified probabilistically from Evaluation queries in polynomial time. 

With the same argument, it is easy to show that an Equivalence query to the 
groups above can be simulated with high probability by polynomially many ran- 
dom Evaluation queries. We can combine this observation with the Equivalence 
and Evaluation query algorithm in Theorem 4. 
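
The simulation of an Equivalence query by random Evaluation queries can be sketched as follows. The function name and parameters are hypothetical, and the sample bound in the note after the code is a standard calculation that holds only under the assumption that a wrong hypothesis disagrees with the target on at least a 1/p fraction of all assignments.

```python
import random


def sampled_equivalence(evaluate, hypothesis, elements, n, samples, rng=random):
    """Look for a counterexample among `samples` uniformly random assignments.

    Returns (a, f(a)) for the first disagreement found, or None if all samples agree.
    """
    for _ in range(samples):
        a = tuple(rng.choice(elements) for _ in range(n))
        value = evaluate(a)                 # one Evaluation query
        if hypothesis(a) != value:
            return a, value
    return None


# Hypothetical target over Z_3 with two variables.
target = lambda a: (a[0] + a[1]) % 3
rng = random.Random(0)
agree = sampled_equivalence(target, target, [0, 1, 2], 2, 50, rng)
found = sampled_equivalence(target, lambda a: (a[0] + a[1] + 1) % 3, [0, 1, 2], 2, 50, rng)
```

If the disagreement fraction is at least 1/p, then p · ln(1/δ) samples miss every counterexample with probability at most (1 − 1/p)^{p ln(1/δ)} ≤ δ.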

Corollary 1. Any solvable group which is an extension of Z_p (p prime) by Z_q 
is learnable from Evaluation queries in probabilistic polynomial time. 

We finally observe that an exponential lower bound on the length of programs 
over G that compute the AND function translates into a quasipolynomial upper 
bound on the number of Evaluation queries needed to identify a program over 
that group. 

Theorem 6. If programs over G cannot compute the AND function in subexpo- 
nential length, then programs over G can be identified from quasipolynomially 
many Evaluation queries. 

Note that it is conjectured in [5] that the exponential lower bound on the 
AND function holds for all solvable groups. 
6 Hardness of Nonsolvable Groups 

In this section we now show that expressions over nonsolvable groups are not 
polynomial-time learnable unless NC^1 circuits are polynomial-time learnable too. 
Hence, under the cryptographic assumptions in [3], they are not learnable at all. 

Theorem 7. Expressions over nonsolvable groups are not learnable from Evalu- 
ation and Equivalence queries with any polynomial-time evaluatable class, unless 
NC^1 circuits also are. 

The proof works by reducing learnability of nonsolvable groups to that of 
boolean programs over simple non-Abelian groups, which are polynomially equiv- 
alent to NC^1 circuits [4]. 

Conceptually, there are three parts in the reduction: 1) learning expressions 
over some nonsolvable group implies learning expressions over some simple non- 
Abelian group; this does not follow trivially from the fact that a nonsolvable 
group contains a simple non-Abelian one, because the learning algorithm for 
the nonsolvable group might conceivably use queries outside the simple non-Abelian one 
to learn it; 2) over a simple non-Abelian group, programs and expressions com- 
pute the same functions; 3) over a simple non-Abelian group, programs simulate 
boolean programs in a prediction-preserving sense. 

7 Aperiodic Monoids 

Theorem 8. Expressions over a monoid in DA are learnable from a polynomial 
number of Evaluation queries and unbounded computation time. 

Although the answers to these many Evaluation queries identify uniquely the 
target, they give no obvious way to predict the value of the target on a different 
input. For a small subclass of DA we know how to reconstruct efficiently an 
expression for the target from these answers. 

Theorem 9. Expressions over idempotent R-trivial monoids are learnable in 
polynomial time from Evaluation queries. The same is true for aperiodic com- 
mutative monoids. 

It is known that there are exactly two minimal monoids outside of DA, named 
U and BA_2. Monoid U is the syntactic monoid of the language (aa*b)*, has 6 
elements, and is known to be universal because programs over it can simulate 
DNF formulas [16]. Monoid BA_2 is the syntactic monoid of (ab)*, has 6 elements 
also, and is provably not universal. 

The problem of learning expressions over U from Evaluation queries 
reduces to the problem of learning monotone DNF formulas from Membership 
queries, which requires exponentially many queries [11]. Similarly, learning ex- 
pressions over BA_2 reduces to learning singleton sets by Membership queries. 

Theorem 10. Expressions over monoid U are not learnable from subexponen- 
tially many Evaluation queries, even using unbounded computation time. The 
same holds for monoid BA_2. 
Abstract. It is known that random k-SAT instances with at least cn 
clauses where c = c_k is a suitable constant are unsatisfiable (with high 
probability). We consider the problem of efficiently certifying the unsatis- 
fiability of such formulas. A result of Beame et al. shows that k-SAT 
instances with at least n^{k-1}/log n clauses can be certified unsatisfiable 
in polynomial time. We employ spectral methods to improve on this: 
We present a polynomial time algorithm which certifies random k-SAT 
instances for k even with at least 2^k · (k/2)^7 · (ln n)^7 · n^{k/2} 
clauses as unsatisfiable (with high probability). 



Introduction 

We study the complexity of certifying unsatisfiability of random k-SAT instances 
(or k-CNF formulas) over n propositional variables. (All our discussion refers to 
k fixed and then letting n be sufficiently large.) The probability space of random 
k-SAT instances has been widely studied in recent years for several good reasons. 
The most recent literature is [A2000],[Fr99], [Be et al98]. 

One of the reasons for studying random k-SAT instances is that they have
the following sharp threshold behaviour [Fr99]: There exists a constant c = c_k
such that for any ε > 0 formulas with at most (1 − ε) · c · n clauses are satisfiable
whereas formulas with at least (1 + ε) · c · n clauses are unsatisfiable with high
probability (that is, with probability tending to 1 as n goes to infinity). In fact,
it has not yet been proven that c_k is a constant. It might be that c_k = c_k(n)
depends on n. However, it is known that c_k is at most 2^k · ln 2 and the general
conjecture is that c_k converges to a constant. For formulas with at least
2^k · (ln 2) · n clauses the expected number of satisfying assignments of a random
formula tends to 0 and the formulas are unsatisfiable with high probability. For
3-SAT instances much effort has been spent on approximating the value of c_3. The
currently best results are that c_3 is at least 3.145 [A2000] and at most 4.601
[KiKrKrSt98]. In [Du et al2000] it is claimed that c_3 < 4.501. (For k = 2 we have
c_2 = 1 [ChRe92], [Go96].)

The algorithmic interest in this threshold is due to the empirical observation
that random k-SAT instances at the threshold, i.e. with around c_k · n random
clauses, are hard instances. The following behaviour has been reported
consistently in experimental studies with suitably optimised backtracking algorithms
searching for a satisfying assignment, see for example [SeMiLe96], [CrAu96]: The
average running time is quite low for instances below the threshold. For 3-SAT
instances we observe: Formulas with at most 4n clauses are satisfiable and it is
quite easy to find a satisfying assignment. A precipitous increase in the average
running time is observed at the threshold. For 3-SAT: About half of the formulas
with 4.2n clauses are satisfiable and it is difficult to decide if a formula is
satisfiable or not. Finally a speedy decline to lower complexity is observed beyond
the threshold. For 3-SAT: All formulas with 4.5n clauses are unsatisfiable and
the running time decreases again (in spite of the fact that now always the whole
backtracking tree must be searched).
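
The first-moment estimate behind the bound 2^k · ln 2 on the threshold can be written out as follows. This is only a sketch, assuming the model in which every literal of every clause is chosen independently and uniformly (the model of Section 1); the symbols X (number of satisfying assignments) and d (clause density, m = dn) are introduced here for illustration.

```latex
\begin{align*}
\mathbb{E}[X]
  &= 2^{n}\,\bigl(1 - 2^{-k}\bigr)^{dn}
  && \text{(a fixed assignment falsifies a random clause with probability } 2^{-k}\text{)} \\
  &= \exp\!\Bigl(n\,\bigl(\ln 2 + d\,\ln(1 - 2^{-k})\bigr)\Bigr) \\
  &< \exp\!\Bigl(n\,\bigl(\ln 2 - d\,2^{-k}\bigr)\Bigr)
  && \text{(since } \ln(1-x) < -x \text{ for } 0 < x < 1\text{)} .
\end{align*}
```

For d ≥ 2^k · ln 2 the exponent in the second line is a negative constant times n, so E[X] tends to 0, and Markov's inequality gives Pr[X ≥ 1] ≤ E[X], which also tends to 0.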

There are no general complexity theoretical results relating the threshold to 
hardness. The following observation is trivial: If we can efficiently certify almost 
all instances with dn clauses where d is above the threshold as unsatisfiable, then 
we can efficiently certify almost all instances with d'n clauses where d' > d as 
unsatisfiable by simply chopping off the superfluous clauses. The analogous fact 
holds below the threshold, where we extend a given formula by some random 
clauses. Analogous observations apply to the number of literals in clauses. 

The relationship between hardness and thresholds is rather general and not
restricted to satisfiability. It is known for k-colourability of random graphs with a
linear number of edges. In [PeWe89] a peak in running time seemingly related to
the threshold is reported. The existence of a threshold is proved in [AcFr99] but 
again the value and convergence to a constant are only known experimentally. 
For the subset sum problem which is of a quite different nature we have also this 
relationship between threshold and hardness: The threshold is known and some 
discussion related to hardness is found in [ImNa96]. 

Abandoning the general complexity theoretic point of view and looking at
concrete algorithms the following results are known for random k-SAT instances:
All progress approximating the threshold from below is based on the analysis
of rather simple polynomial time heuristics. In fact the most advanced heuristic
being analysed [A2000] only finds a satisfying assignment with probability at
least ε, where ε > 0 is a small constant, for 3-SAT formulas with at most 3.145n
clauses. The heuristic in [FrSu96] finds a satisfying assignment almost
always for 3-SAT instances with at most 3.003n clauses. On the other hand the
progress made in approximating the threshold from above does not provide us 
at all with efficient algorithms certifying the unsatisfiability of the formula at 
hand. Only the expectation of the number of satisfying assignments is calculated 
and is shown to tend to 0. 

In fact beyond the threshold we have negative results: For arbitrary but
fixed d ≥ 2^k · ln 2 random k-SAT instances with dn clauses (are unsatisfiable
and) have only resolution proofs with an exponential number of clauses, that is
with at least (1 + ε)^n = 2^{Ω(n)} clauses [ChSz88]. This has been improved upon by
[Fu98], [BePi96], and [Be et al98], all proving (exponential) lower bounds for
somewhat larger clause/variable ratios. Note that a lower bound on the size of
resolution proofs provides a lower bound on the number of nodes in any classical
backtracking tree as generated by any variant of the well known Davis-Putnam
procedure.

Provably polynomial time results beyond the threshold are rather limited
so far: In [Fu98] it is shown that k-SAT formulas with at least n^{k-1} clauses
allow for polynomial size resolution proofs and thus can be certified unsatisfiable
efficiently. This is strengthened in [Be et al98] to the best result known so far:
For at least n^{k-1}/log n random clauses a backtracking based algorithm proves
unsatisfiability in polynomial time with high probability. (In fact the result of
Beame et al. is slightly stronger as it applies to formulas with
Ω(n^{k-1}/log^{k-2} n) random clauses.)

We extend the range of clause numbers for which a provably polynomial time
algorithm exists. We give an algorithm which works (with high probability) when
the number of clauses is only n raised to a constant fraction of k. Our algorithm
certifies k-SAT instances for k even with at least 2^k · (k/2)^7 · (ln n)^7 · n^{k/2}
clauses as unsatisfiable. We thus get the first improvement of existing bounds
for k = 4. To obtain our result we leave the area of strictly combinatorial
algorithms considered so far. Instead we associate a graph with a given formula
and show how to certify unsatisfiability of the formula with the help of the
eigenvalue spectrum of a certain matrix associated to this graph. Note that the
eigenvalue spectrum can be calculated in polynomial time by standard linear
algebra methods.
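
To get a feeling for how the two clause bounds compare, the following sketch evaluates both for concrete n. It assumes that the spectral bound reads 2^k · (k/2)^7 · (ln n)^7 · n^{k/2} and that the logarithm in the bound of Beame et al. is the natural one; the function names are hypothetical and chosen only for this illustration.

```python
import math


def spectral_clause_bound(n, k):
    """Clause count 2^k * (k/2)^7 * (ln n)^7 * n^(k/2) (assumed form of the bound)."""
    return 2 ** k * (k / 2) ** 7 * math.log(n) ** 7 * n ** (k / 2)


def backtracking_clause_bound(n, k):
    """Clause count n^(k-1) / log n; using the natural logarithm is an assumption."""
    return n ** (k - 1) / math.log(n)


if __name__ == "__main__":
    # Compare the two bounds for k = 4 at a moderate and at a very large n.
    for n in (10 ** 6, 10 ** 18):
        smaller = spectral_clause_bound(n, 4) < backtracking_clause_bound(n, 4)
        print(n, "spectral bound is smaller:", smaller)
```

For k = 4 the spectral bound is the larger one at n = 10^6 and the smaller one at n = 10^18, which reflects that the improvement is asymptotic: the constant and the (ln n)^7 factor dominate for moderate n.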

Eigenvalues are used in two ways in the algorithmic theory of random struc- 
tures: They can be used to find a solution of an NP-hard problem in a random 
instance generated in such a way that it has a solution (not known to the
algorithm). An example for 3-colourability is [AlKa94]. They can also be used to
prove the absence of a solution of an NP-problem. However, these applications
are somewhat rare at the moment. The most prominent example here is the
expansion property of random regular graphs [AlSp92]. Note that the expansion
property is coNP-complete [Bl et al81] and the eigenvalues certify the absence
of a non-expanding subset of vertices (which is the solution in this case). Our 
result is an example of the second kind. 

1 From Random Formulas to Random Graphs 

We use the following notation throughout: Form_{n,k,m} is our probabilistic model
of k-SAT instances with m clauses over n propositional variables. Most of the
time we assume that k is even. Form_{n,k,m} is defined as follows: The probability
space of clauses of size k, Clause_{n,k}, is the set of ordered k-tuples of literals
over n propositional variables v_1, ..., v_n. We write l_1 ∨ ... ∨ l_k with l_i = x or
l_i = ¬x where x is one of our variables. Our definition of Clause_{n,k} allows for
clauses containing the same literal twice and clauses which contain a variable
and its negation in order to simplify the subsequent presentation. We consider
Clause_{n,k} as endowed with the uniform probability distribution: The probability
of a clause is given by P(l_1 ∨ ... ∨ l_k) = (1/(2n))^k. Form_{n,k,m} is the m-fold
Cartesian product space of Clause_{n,k}. We write F = C_1 ∧ ... ∧ C_m and
P(F) = (1/(2n))^{k·m}. There are several other ways of defining k-SAT probability
spaces. Our results apply to these spaces, too. We discuss this matter after the
presentation of the algorithm.
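
A minimal sketch of sampling from this probability space may help to fix the model. The integer encoding of literals (+v for the variable v, -v for its negation) and all function names are assumptions made for this illustration and are not part of the paper.

```python
import random


def random_clause(n, k, rng):
    """Ordered k-tuple of literals, each drawn uniformly from the 2n literals.

    Repeated literals and complementary pairs are allowed, as in the model above.
    """
    return tuple(rng.choice((1, -1)) * rng.randint(1, n) for _ in range(k))


def random_formula(n, k, m, rng):
    """Sequence of m independent random clauses over n variables."""
    return [random_clause(n, k, rng) for _ in range(m)]


def is_all_positive(clause):
    """True if no literal of the clause is negated."""
    return all(literal > 0 for literal in clause)


if __name__ == "__main__":
    rng = random.Random(0)
    formula = random_formula(n=5, k=4, m=10, rng=rng)
    print(sum(is_all_positive(clause) for clause in formula), "all-positive clauses")
```

Each clause is all-positive with probability (1/2)^k, which is the fact used later when counting the edges of the associated graph.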

Our algorithm uses the following graphs: 

Definition 1. Let F ∈ Form_{n,k,m} be given. The graph G = G_F depends only
on the sequence of all-positive clauses of F:

- The set of vertices of G is V = V_F = {x_1 ∨ ... ∨ x_{k/2} | x_i a variable}. We
  have |V| = n^{k/2} and V is independent of F.

- The set of edges of G, E = E_F, is given by: For two different vertices
  x_1 ∨ ... ∨ x_{k/2} and y_1 ∨ ... ∨ y_{k/2} we have that
  {x_1 ∨ ... ∨ x_{k/2}, y_1 ∨ ... ∨ y_{k/2}} ∈ E iff
  x_1 ∨ ... ∨ x_{k/2} ∨ y_1 ∨ ... ∨ y_{k/2} (or y_1 ∨ ... ∨ y_{k/2} ∨ x_1 ∨ ... ∨ x_{k/2})
  is a clause of F.
  Note that it is possible that |E| < m as a clause might induce no edge or
  two clauses induce the same edge. Our definition does not allow for loops or
  multiple edges.

The graph H_F is defined in a totally analogous way for the all-negative clauses
of F. □
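
The construction of Definition 1 can be sketched in a few lines. The encoding is an assumption of this illustration: a clause is a tuple of non-zero integers (+v for the variable v, -v for its negation), a vertex is the tuple of k/2 variable indices, and the hypothetical function clause_graph returns only the edge set.

```python
def clause_graph(formula, k, positive=True):
    """Edge set of G_F (positive=True) or H_F (positive=False).

    Only clauses whose literals all have the requested sign contribute.
    Each such clause joins its first half to its second half.
    """
    half = k // 2
    edges = set()
    for clause in formula:
        if all((literal > 0) == positive for literal in clause):
            first = tuple(abs(literal) for literal in clause[:half])
            second = tuple(abs(literal) for literal in clause[half:])
            if first != second:                       # loops are not allowed
                edges.add(frozenset((first, second)))  # the set drops multiple edges
    return edges


if __name__ == "__main__":
    formula = [(1, 2, 2, 1), (2, 1, 1, 2), (1, 1, 1, 1), (-1, -2, -2, -2), (1, -2, 1, 2)]
    print(clause_graph(formula, 4, positive=True))
    print(clause_graph(formula, 4, positive=False))
```

In the example the first two clauses induce the same edge, the third would be a loop and the last is mixed, so G_F has one edge although the formula has five clauses. This is the effect |E| < m noted in the definition.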

Recall that an independent set of a graph G is a subset of vertices W of G
such that we have no edge {v, w} in G where both v, w ∈ W. The independence
number of the graph G, denoted by α(G), is the number of vertices in a largest
independent set. It is NP-hard to determine α(G).

Lemma 2. If F ∈ Form_{n,k,m} is satisfiable then

α(G_F) ≥ (1/2)^{k/2} · |V| or α(H_F) ≥ (1/2)^{k/2} · |V|.

As k remains constant when n gets large this means that we have independent
sets consisting of a constant fraction of all vertices of G_F or H_F.

Proof. Let A be an assignment of the n underlying variables with the truth
values 0, 1 (where 0 = false and 1 = true) which makes F true. We assume
that A assigns 1 to at least n/2 variables. Let S be this set of variables. Then F
has no all-negative clause consisting only of literals over S. Therefore H_F has an
independent set with at least |S|^{k/2} ≥ (1/2)^{k/2} · |V| vertices. If the assignment
assigns more than half of the variables a 0 the analogous statement applies to
G_F. □
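
The proof of Lemma 2 is constructive, and the independent set it describes can be produced directly from a satisfying assignment. The sketch below assumes the integer encoding of literals (+v, -v) and represents vertices as tuples of variable indices; the function names are hypothetical.

```python
from itertools import product


def graph_edges(formula, k, positive):
    """Edges of G_F (positive=True) or H_F (positive=False)."""
    half = k // 2
    edges = set()
    for clause in formula:
        if all((literal > 0) == positive for literal in clause):
            first = tuple(abs(literal) for literal in clause[:half])
            second = tuple(abs(literal) for literal in clause[half:])
            if first != second:
                edges.add(frozenset((first, second)))
    return edges


def lemma2_independent_set(formula, k, assignment):
    """Independent set of H_F or G_F built from a satisfying assignment.

    `assignment` maps every variable 1..n to True or False.
    """
    true_vars = [v for v, value in assignment.items() if value]
    false_vars = [v for v, value in assignment.items() if not value]
    if len(true_vars) >= len(false_vars):
        chosen, edges = true_vars, graph_edges(formula, k, positive=False)  # H_F
    else:
        chosen, edges = false_vars, graph_edges(formula, k, positive=True)  # G_F
    vertices = set(product(chosen, repeat=k // 2))
    # An edge inside `vertices` would be a clause falsified by the assignment.
    assert not any(edge <= vertices for edge in edges), "assignment does not satisfy formula"
    return vertices


if __name__ == "__main__":
    formula = [(1, 2, -3, 3), (-1, -2, -3, -3)]
    print(sorted(lemma2_independent_set(formula, 4, {1: True, 2: True, 3: False})))
```

With n = 3 and k = 4 the example yields 4 of the 9 vertices, which is above the bound (1/2)^{k/2} · |V| = 2.25 of the lemma.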

In the subsequent discussion we refer mainly to G_F. Of course everything applies
also to H_F. We need to show that the distribution of G_F is just the distribution
of a usual random graph. To this end let G_{n,m} be the probability space of
random graphs with n labelled vertices and m different edges. Each graph is
equally likely, that is, the probability of G is P(G) = 1/\binom{\binom{n}{2}}{m}.

Lemma 3. (1) Conditional on the event in Form_{n,k,m} that |E_F| = r the graph
G_F is a random member of the space G_{ν,r} where ν = n^{k/2} is the number of
vertices of G_F.

(2) Let ε > 0. For F ∈ Form_{n,k,m} the number of edges of G_F is between
m · (1/2)^k · (1 − ε) and m · (1/2)^k · (1 + ε) with high probability.

Proof. (1) Let V = {x_1 ∨ ... ∨ x_{k/2} | x_i a variable} be the set of vertices. Let
G = (V, E) be a graph with |E| = r. We show further below that the probability
of the event that F ∈ Form_{n,k,m} induces the edge set E, denoted by
P(F; E_F = E), depends only on r, but is independent of the actual edge set E.
This implies the claim because

P(F; E_F = E | |E_F| = r) = P(F; E_F = E) / P(|E_F| = r) = 1/\binom{\binom{ν}{2}}{r},

where the last equation holds because P(F; E_F = E) is independent of E and
therefore must be the same for all E with r edges.

It remains to show that P(F; E_F = E) is independent of E. To this end we
show that

P(F; E_F = E | F has exactly s all-positive clauses)

is independent of E. This implies the claim because by conditioning

P(F; E_F = E) = Σ_{s≥0} P(F; F has s positive clauses)
                        · P(F; E_F = E | F has s positive clauses).

The distribution of Form_{n,k,m} conditional on the set of formulas with exactly
s all-positive clauses is the uniform one. We therefore just need to count the
number of formulas F with exactly s positive clauses such that E_F = E.
Each such formula F with E_F = E is obtained exactly once by the following
choosing process: 1. Pick a sequence of s positive clauses with k literals
(C_1, ..., C_s) ∈ (Clause_{n,k})^s which induce the edge set E. 2. Pick s positions
from the m positions available and put the clauses (C_1, ..., C_s) from left to
right into the corresponding slots. 3. Fill the remaining m − s positions of F
with clauses containing at least one negative literal.

For two edge sets E, E' with r edges there is a natural (but technically not easy
to describe) bijective correspondence between the (C_1, ..., C_s) for E and the
(C'_1, ..., C'_s) for E' picked in step 1. Therefore the number of choosing
possibilities is independent of the actual set E and we are done.

(2) The claim follows from the following statements which we prove further
below:

- Let ε > 0 be fixed. The number of all-positive clauses of F ∈ Form_{n,k,m} is
  between (1 − ε) · (1/2)^k · m and (1 + ε) · (1/2)^k · m with high probability.

- The number of all-positive clauses like x_1 ∨ ... ∨ x_{k/2} ∨ x_1 ∨ ... ∨ x_{k/2}, that
  is with the same first and second half, is o(m).

- The number of unordered pairs of positions of F on which we have positive
  clauses which induce only one edge, that is pairs of clauses
  {x_1 ∨ ... ∨ x_k, y_1 ∨ ... ∨ y_k} where
  {x_1 ∨ ... ∨ x_{k/2}, x_{k/2+1} ∨ ... ∨ x_k} = {y_1 ∨ ... ∨ y_{k/2}, y_{k/2+1} ∨ ... ∨ y_k},
  is also o(m) with high probability.

This implies the claim of the lemma with ε slightly lower than the ε from the
first statement above because we have only o(m) clauses inducing no additional
edge.

The first statement: This statement follows with Chernoff bounds because the
probability that a clause at a fixed position is all-positive is (1/2)^k and clauses
at different positions are independent. The second statement: The probability
that the clause at position i has the same first and second half is (1/n)^{k/2}. The
expected number of such clauses in a random F is therefore at most
m · (1/n)^{k/2} = o(m).

The third statement: We fix 2 positions i ≠ j of F. The probability that the
clauses at these positions have the same set of first and second halves is 2 · (1/n)^k
and the expected number of such unordered pairs is at most
\binom{m}{2} · 2 · (1/n)^k = O(m/n) provided m = O(n^{k-1}), which we can assume.
Let X be the random variable counting the number of unordered pairs of positions
with clauses with the same first and second half and let ε > 0. Markov's
inequality gives us

P(X > n^ε · EX) ≤ EX/(n^ε · EX) = 1/n^ε.

Therefore we get that with high probability X ≤ n^ε · O(m/n) = o(m). □

2 Spectral Considerations 

Eigenvalues of matrices associated with general graphs are somewhat less com- 
mon at least in Computer Science applications than those of regular graphs. The 
monograph [Ch97] is a standard reference for the general case. The easier regular 
case is dealt with in [AlSp92]. The necessary Linear Algebra details cannot all 
be given here. They are very well presented in the textbook [St88]. 

Let G = (V, E) be an undirected graph (loopless and without multiple
edges) with V = {1, ..., n} being a standard set of n vertices. For 0 < p < 1 we
consider the matrix A = A_{G,p} as in [KrVu2000] and [Ju82] which is defined as
follows:

The (n × n)-matrix A = A_{G,p} = (a_{i,j})_{1≤i,j≤n} has a_{i,j} = 1 iff {i, j} ∉ E
and a_{i,j} = −(1 − p)/p = 1 − 1/p iff {i, j} ∈ E. In particular a_{i,i} = 1. As A
is real and symmetric A has n real eigenvalues when counting them with their
multiplicities. We denote these eigenvalues by λ_1(A) ≥ λ_2(A) ≥ ... ≥ λ_n(A).
Now we have an efficiently computable upper bound for α(G):

Lemma 4. (Lemma 4 of [KrVu2000]) For any possible p, λ_1(A_{G,p}) ≥ α(G).

Proof. Let l = α(G). Then the matrix A_{G,p} has an l × l-block which
contains only 1's. This block of course is indexed with the vertices from a largest
independent set. It follows from interlacing with a suitable l × n-matrix N (cf.
Lemma 31.5, page 396 of [vLWi]) that λ_1(A_{G,p}) is at least as large as l. This is
the claim. □
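
The bound of Lemma 4 can also be seen through a Rayleigh quotient, which is easy to check numerically. This is a swapped-in argument, not the interlacing proof cited above: for the 0/1 indicator vector x of an independent set W, the quotient x^T A x / x^T x equals |W|, and the largest eigenvalue of a symmetric matrix is at least any Rayleigh quotient. The sketch uses vertices 0, ..., n−1 and hypothetical function names.

```python
def matrix_A(n, edges, p):
    """A_{G,p}: entry 1 for non-edges and on the diagonal, 1 - 1/p for edges."""
    A = [[1.0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1.0 - 1.0 / p
    return A


def rayleigh_quotient(A, x):
    """x^T A x / x^T x, a lower bound on the largest eigenvalue of symmetric A."""
    n = len(A)
    numerator = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    return numerator / sum(value * value for value in x)


# Path 0 - 1 - 2 - 3 with the independent set {0, 2}.
A = matrix_A(4, [(0, 1), (1, 2), (2, 3)], p=0.5)
indicator = [1.0, 0.0, 1.0, 0.0]
print(rayleigh_quotient(A, indicator))  # prints 2.0, the size of the independent set
```

Only the all-ones block indexed by the independent set contributes to the numerator, which is why the value does not depend on p.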

In order to bound the size of the eigenvalues of A_{G,p} when G is a random
graph we rely on a suitably modified version of the following theorem:

Theorem 5. (Theorem 2 of [FuKo81]) Let for 1 ≤ i, j ≤ n and i ≤ j a_{i,j}
be independent, real valued random variables (not necessarily identically
distributed) satisfying the following conditions:

- |a_{i,j}| ≤ K for all i ≤ j,

- the expectation E a_{i,i} = ν for all i,

- the expectation E a_{i,j} = 0 for all i < j,

- the variance V a_{i,j} = E[a_{i,j}^2] − (E a_{i,j})^2 = σ^2 for all i < j,

where the values K, ν, σ are constants independent of n.

For j > i let a_{j,i} = a_{i,j} and let A = (a_{i,j})_{1≤i,j≤n} be the random
(n × n)-matrix defined by the a_{i,j}. Let the eigenvalues of A be
λ_1(A) ≥ λ_2(A) ≥ ... ≥ λ_n(A). With probability at least 1 − (1/n)^{10} the matrix A
is such that

Max{|λ_i(A)| : 1 ≤ i ≤ n} = 2 · σ · √n + O(n^{1/3} · log n) = 2 · σ · √n · (1 + o(1)).

□

We intend to apply this theorem to a random matrix A = A_{G,p} where G is a
random graph from the probability space G_{n,m}. However, in this case the entries
of A are not strictly independent and Theorem 5 cannot be directly applied. We
first consider random graphs from the space G_{n,p} and proceed to G_{n,m} later on.
Recall that a random graph G from G_{n,p} is obtained by inserting each possible
edge with probability p independently of other edges.

For p constant and G a random member from G_{n,p} the assumptions of
Theorem 5 can easily be checked to apply to A_{G,p}. However, for sparser random
graphs, that is p = p(n) = o(1), the situation changes. We have that a_{i,j} can
assume the value −1/o(1) + 1 and thus is no longer bounded in absolute value by a
constant. The same applies to the variance: σ^2 = (1 − p)/p = 1/o(1) − 1.

It can however be checked that the proof of Theorem 5 as given in [FuKo81]
goes through as long as we consider matrices A_{G,p} where p = (ln n)^7/n. In
this case we have that K = n/(ln n)^7 − 1 and σ^2 = n/(ln n)^7 − 1. With this
modification and the other assumptions just as before the proof of [FuKo81]
leads to:

Corollary 6. With probability at least 1 − (1/n)^{10} the random matrix A satisfies
Max{|λ_i(A)| : 1 ≤ i ≤ n} = 2 · σ · √n + O(n/(ln n)^{22/6}).

Proof. We sketch the changes which need to be applied to the proof of Theorem
2 in [FuKo81]. These changes refer to the final estimations of the proof on page
237. We set

k := (σ/K)^{1/3} · n^{1/6} = (ln n)^{7/6} · (1 + o(1)),

in fact k should be the closest even number. We set the error term

v := 50 · n/(ln n)^{22/6}.

We have

2 · σ · √n = 2 · n/(ln n)^{7/2} = 2 · n/(ln n)^{21/6},

which implies that v = o(2 · σ · √n). Concerning the error estimate we get

v · k / (2 · σ · √n + v) = (50 · (ln n)^{7/6} / (ln n)^{1/6}) · (1 + o(1)) = 50 · ln n · (1 + o(1)).

This implies the claim.



Together with Lemma 4 we now get an efficiently computable certificate
bounding the size of independent sets in random graphs from G_{n,m}.

Corollary 7. Let G be a random member from G_{n,m} where m = ((ln n)^7/2) · n
and let p = m/\binom{n}{2} = (ln n)^7/(n − 1). We have with high probability that

λ_1(A_{G,p}) ≤ 2 · (1/(ln n)^{7/2}) · n · (1 + o(1)).



Proof. The proof is a standard transfer from the random graph model G_{n,p}
to G_{n,m}. For G random from G_{n,p} the induced random matrix A_{G,p} satisfies
the assumptions of the last corollary. We have that with probability at least
1 − (1/n)^{10} the eigenvalues of A_{G,p} are bounded by
2 · (1/(ln n)^{7/2}) · n · (1 + o(1)).

By the Local Limit Theorem for the binomial distribution the probability
that a random graph from G_{n,p} has exactly m edges is
Ω(1/(n · (ln n)^7)^{1/2}) = Ω(1/(√n · (ln n)^{7/2})). This implies the claim as the
probability in G_{n,p} that the eigenvalue is not bounded as claimed is
O((1/n)^{10}) = o(1/(√n · (ln n)^{7/2})). (We omit the formal conditioning
argument.) □



3 The Algorithm 

We consider the probability space of formulas Form = Form_{n,k,m} where k is even
and the number of clauses is

m = 2^k · (ln n^{k/2})^7 · n^{k/2} = 2^k · (k/2)^7 · (ln n)^7 · n^{k/2}.

Given a random formula F from Form the algorithm first considers the
all-positive clauses from F and constructs the graph G_F. From Lemma 3 we know
that G = G_F is a random member of G_{ν,μ} where ν = n^{k/2} and
μ ≥ m · (1/2)^k · (1 − ε) = (ln ν)^7 · ν · (1 − ε), where we fix ε > 0 sufficiently
small, in fact ε = 1/2 will do. In case the number of edges is smaller than this
bound the algorithm fails.

The algorithm determines the matrix A = A_{G,p} where
p = μ/\binom{ν}{2} ≥ (ln ν)^7/(ν − 1). From Corollary 7 we get that with high
probability

λ_1(A) ≤ 2 · (1/(ln ν)^{7/2}) · ν · (1 + o(1)) < (1/2)^{k/2} · ν

for n sufficiently large. In case the second inequality does not hold the algorithm
fails. By Lemma 4 G_F has no independent set with (1/2)^{k/2} · ν vertices.

The algorithm proceeds in the same way for the all-negative clauses and the
graph H_F. In case it succeeds (which happens with high probability) we have
that F is unsatisfiable by Lemma 2.
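
The overall structure of the algorithm can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: literals are encoded as integers (+v, -v); the check on the number of edges is reduced to rejecting empty graphs; and instead of an eigenvalue solver the sketch uses the cruder but sound upper bound trace(A^{2^s})^{1/2^s} ≥ λ_1(A), which holds for every real symmetric matrix. All function names are hypothetical.

```python
from itertools import product


def graph_edges(formula, k, positive):
    """Edges of G_F (positive=True) or H_F (positive=False)."""
    half = k // 2
    edges = set()
    for clause in formula:
        if all((literal > 0) == positive for literal in clause):
            first = tuple(abs(literal) for literal in clause[:half])
            second = tuple(abs(literal) for literal in clause[half:])
            if first != second:
                edges.add(frozenset((first, second)))
    return edges


def lambda1_upper_bound(A, squarings=3):
    """trace(A^(2^s))^(1/2^s), an upper bound on lambda_1 of a symmetric matrix."""
    size = len(A)
    M = A
    for _ in range(squarings):
        M = [[sum(M[i][l] * M[l][j] for l in range(size)) for j in range(size)]
             for i in range(size)]
    return sum(M[i][i] for i in range(size)) ** (1.0 / 2 ** squarings)


def certify_unsat(formula, n, k):
    """Return True only if both G_F and H_F are certified to have no
    independent set with (1/2)^(k/2) * nu vertices. False means 'no certificate'."""
    nu = n ** (k // 2)
    vertices = list(product(range(1, n + 1), repeat=k // 2))
    index = {vertex: i for i, vertex in enumerate(vertices)}
    threshold = 0.5 ** (k // 2) * nu
    for positive in (True, False):
        edges = graph_edges(formula, k, positive)
        if not edges:
            return False                      # p would be 0, give up
        p = len(edges) / (nu * (nu - 1) / 2)
        A = [[1.0] * nu for _ in range(nu)]
        for edge in edges:
            u, w = tuple(edge)
            A[index[u]][index[w]] = A[index[w]][index[u]] = 1.0 - 1.0 / p
        if lambda1_upper_bound(A) >= threshold:
            return False                      # eigenvalue bound too weak
    return True


if __name__ == "__main__":
    unsat = [(1, 2), (1, 3), (2, 3), (-1, -2), (-1, -3), (-2, -3)]
    print(certify_unsat(unsat, n=3, k=2))
```

A return value of True is a proof of unsatisfiability by Lemmas 2 and 4, because the trace bound is never below λ_1. A return value of False carries no information. On instances of realistic size one would compute λ_1 with a numerical eigenvalue routine, and the guarantee of success with high probability applies only to the clause numbers stated above.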

In case we want to apply this algorithm when the number of literals per
clause k is odd we first extend each clause by a random literal. The algorithm
succeeds when the number of clauses is
2^{k+1} · ((k + 1)/2)^7 · (ln n)^7 · n^{(k+1)/2}.

Some technical matters come up when this algorithm is applied to other
k-SAT probability spaces used in the literature. The first problem arises when
the formulas are defined such that clauses are not allowed to contain the same
literal several times. This implies that certain edges are excluded from the graph
G_F and we can no longer speak of a random graph.

The probability that a random clause from our space Clause_{n,k} has the same
literal several times is bounded from above by \binom{k}{2} · (1/n) = O(1/n) and
bounded from below by 1/(2n). Thus the expected number of clauses with the
same literal several times in a formula from the space Form_{n,k,m} is O(m/n).
Recall that m ≥ n^{k/2} for our algorithm to work so there are quite a few clauses
with double occurrences. By the Local Limit Theorem for the binomial
distribution with parameters m and O(1/n) the probability that a formula from
Form_{n,k,m} has exactly the expected number of clauses with double occurrences
is Ω(1/(m/n)^{1/2}). Let ν = n^{k/2} and m = O((ln ν)^7 · ν), then still we have that
O((1/ν)^{10}) = o(1/(ln ν)^7 · 1/(m/n)^{1/2}), cf. the proof of Corollary 7.

Now, given a random sequence of clauses F' without double occurrences of
literals we add randomly exactly the expected number of clauses with double
occurrences of literals to get the formula F. Then we apply our algorithm to the
resulting formula. With high probability (by the above local limit consideration)
the algorithm certifies that G_F has only independent sets with o(ν) vertices given
the number of clauses of F is m = 2^k · (ln ν)^7 · ν. After deleting the edges of G_F
which are induced by the O(m/n) double occurrence clauses any independent
set can only increase by O(2m/n) vertices. This implies that we still have no
linear size independent set of vertices in G_{F'}. This and the same consideration
for the graph H_{F'} certifies the unsatisfiability of F'.

The remaining variants of probability spaces (clauses as sets, formulas as 
sets, picking each clause with a probability p) can more easily be dealt with. 

Conclusion 

By now a large part of the algorithmic theory of random structures is concerned 
with efficient algorithms finding solutions to an NP-problem. Often the
probability spaces used are designed in such a way that we know a solution is present
and the algorithm then must find it (or any other solution).

The present paper is concerned with the complementary aspect. We certify
efficiently the absence of a solution to an NP-problem in cases where this
absence is otherwise known only by non-efficient means. It seems that this aspect
has so far been somewhat neglected in the algorithmic theory of random
structures. We think it deserves
more attention as it may lead to natural questions about natural probability 
spaces. Spectral methods are one way to deal with these problems. A paper in 
the same spirit is the recent [KrVu2000] where the non-existence of a colouring 
with a given number of colours is certified by spectral methods. 

One problem which can be directly treated based on the ideas developed here
is the 3-colouring problem of sparse random graphs: For random graphs with c · n
edges the following facts are known: For c < 1.932 graphs are 3-colourable with
high probability [AcMo97]. For c > 2.522 graphs are not 3-colourable [DuZi98],
[AcMo]. There is a sharp threshold [AcFr99] with experimental hardness.

The results of the present paper imply that for c = c(n) ≥ 1/2 · (ln n)^7 we
can efficiently certify that there is no longer an independent set with n/3
vertices (with high probability). Therefore we have no 3-colouring.

The following two problems however seem to require new ideas: First, the
efficient certification of unsatisfiability of formulas with fewer than n^{k/2}
clauses. The problem here is that the average degree in the graph G_F now is o(1)
and the bounds on the eigenvalues make no sense. Second, to improve the bound of
n^2/log n known for 3-SAT.

Abstract. We study the circuit complexity of generating at random a word of
length n from a given language under uniform distribution. We prove that, for
every language accepted in polynomial time by 1-NAuxPDA of polynomially
bounded ambiguity, the problem is solvable by a logspace-uniform family of
probabilistic boolean circuits of polynomial size and O(log^2 n) depth. Using a
suitable notion of reducibility (similar to the NC^1-reducibility), we also show the
relationship between random generation problems for regular and context-free
languages and classical computational complexity classes such as DIV, L and
DET.

Keywords: Uniform random generation, ambiguous context-free languages, aux- 
iliary pushdown automata, circuit complexity. 



1 Introduction 

Given a formal language L, the uniform random generation problem for L consists of
computing, for an instance n > 0, a word of length n in L uniformly at random. We
study the circuit complexity of this problem for several classes of languages including
regular, context-free (c.f. for short) and more generally languages accepted by one-way
nondeterministic auxiliary push-down automata (1-NAuxPDA).

Several sequential algorithms have been proposed for the random generation of
strings in regular and context-free languages [12, 10, 9, 11]. The problem is
particularly interesting in the c.f. case because these languages can codify a wide
variety of combinatorial structures; moreover, sampling words from c.f. languages is
naturally motivated by other applications such as testing parsers of programming
languages [12] or evaluating the performance of algorithms which process DNA
sequences [20, 19].

In the case of unambiguous c.f. languages the best known algorithm for random
generation works in O(n log n) arithmetic time [10]; this is a special case of more
general procedures for the random generation of so called "labelled combinatorial
structures". In the case of general (possibly ambiguous) c.f. languages a
subexponential time algorithm is described in [11] for the (almost uniform) random
generation of strings of given length. The problem is solvable in polynomial time if
the language is generated by a c.f. grammar of polynomially bounded ambiguity [4].
This result also holds for languages accepted by polynomial time 1-NAuxPDA of
polynomially bounded ambiguity and, under suitable hypotheses, a similar approach
can be applied to the combinatorial structures that admit an ambiguous specification
(in the sense that the same object may have several distinct descriptions).

In this work we give a classification of the circuit complexity of these problems which
includes languages described by possibly ambiguous specifications.

Our most general result states that for every language accepted by a polynomial
time 1-NAuxPDA of polynomially bounded ambiguity the uniform random generation
problem can be solved by a log-space uniform family of probabilistic boolean circuits
of polynomial size and O(log^2 n) depth. This, in particular, emphasizes the difference
between counting and random generation: indeed, for some finitely ambiguous c.f.
languages the counting problem is #P_1-complete [3].

Stronger results can be obtained for less general and well-known classes of
languages such as regular and context-free languages. To compare the complexity of our
problem for such classes, we give a natural extension of the usual NC^1
reducibility [7]. We say that the uniform random generation problem for a language L
is RNC^1-reducible to a class C of boolean functions if it can be solved by a
logspace-uniform family of probabilistic boolean circuits of polynomial size and
O(log n) depth using oracle nodes in C. Using this notion we show the relationship
between our problem and classical computational complexity classes such as DIV, DET
and #SAC^1 [7, 21] (here defined in Section 2).

We show that, for every regular language, the problem of uniform random generation
is RNC^1-reducible to the class DIV; moreover, in the case of unambiguous c.f.
languages the problem is RNC^1-reducible to DIV ∪ L and, for polynomially ambiguous
c.f. languages, it is RNC^1-reducible to #SAC^1. Finally, we consider a general version
of the uniform random generation problem for regular languages, where the
deterministic finite automaton describing the language is part of the input; in this
case, the problem is RNC^1-reducible to DET. These results are obtained by combining
the complexity of counting and recognition problems with the study of some
reachability problems on certain random graphs arising from the design of the circuits.

2 Probabilistic Circuits for Random Generation 

We assume some familiarity with (bounded fan-in) boolean circuits as defined in [7,
22]. We say that a family {c_n}_{n>0} of boolean circuits is uniform if there exists a
log-space bounded Turing machine which on input 1^n computes a description of c_n. The
class NC^k is the set of boolean functions computable by uniform families of boolean
circuits of polynomial size and O(log^k n) depth, where n is the input size. A boolean
function f is NC^1-reducible to a boolean function g, if f can be computed by a uniform
family of boolean circuits of polynomial size and O(log n) depth equipped with oracle
nodes for computing g; here, the depth of any oracle node with fan-in i and fan-out o
counts for ⌈log(i + o)⌉. Given a class C of boolean functions, we denote by NC^1(C)
the closure of C under NC^1 reducibility.

Let intdet and intdiv be the problems of computing respectively the determinant of
an n × n matrix of n-bit integers and the division of two n-bit integers. As usual, we
denote by L (NL) the class of languages recognized in O(log n) space by a deterministic
(nondeterministic) Turing machine. Hence, the classes L*, NL*, DET and DIV are
defined respectively by L* = NC^1(L), NL* = NC^1(NL), DET = NC^1({intdet}) and
DIV = NC^1({intdiv}). The following relations are known [7]:

NC^ C ^ C DET C NC^ 

Finally, by #SAC^1 we denote the set of functions computing the number of
accepting subtrees in a uniform family of semi-unbounded circuits of polynomial size
and O(log n) depth [21]; we also recall that #SAC^1 ⊆ NC^2.

In this work we use boolean circuits to solve uniform random generation problems. 
To this end we use the notion of probabilistic boolean circuit as introduced in [7]. This 
is a boolean circuit equipped in addition with independent and identically distributed 
random input bits: each of them assumes a value in {0, 1} with probability 1/2. 

Example 1. Consider the problem of generating at random an integer according to some 
specified distribution. Let $a_1, a_2, \ldots, a_n$ be n-bit positive integers; we design a 
probabilistic boolean circuit $c_n$ which, on input $a_1, a_2, \ldots, a_n$, outputs a 
$k \in \{1, 2, \ldots, n\} \cup \{\perp\}$ such that: 

1. $\Pr\{k = \perp\} \le 1/4$, 

2. for every $1 \le i \le n$, $\Pr\{k = i \mid k \neq \perp\} = a_i/a$, where $a = \sum_{i=1}^{n} a_i$. 

First of all, the circuit computes in parallel all $s_i = \sum_{j \le i} a_j$, for $1 \le i \le n$; then 
it computes $\ell = \min\{i : s_n \le 2^i\}$. Let now $r_1, r_2 \in \{1, 2, \ldots, 2^\ell\}$ be two random 
integers defined by two distinct sets of $\ell$ random input bits each. The circuit computes 
in parallel $k_j = \min\{i : r_j \le s_i\}$ for $j = 1, 2$ (where we assume $\min \emptyset = \perp$). Finally 
it outputs $k_1$ if this is different from $\perp$, else it outputs $k_2$. 

Since $s_n > 2^{\ell-1}$, each $r_j$ exceeds $s_n$ with probability less than 1/2; hence the 
probability of giving $\perp$ as output is less than or equal to 1/4 while, 
if this is not the case, the output has the required distribution. Recalling the circuit 
complexity of elementary arithmetic operations [22], one can conclude that the size of 
the circuit is polynomial and its depth is $O(\log n)$. 

Notice that, by taking m parallel copies of the same circuit, with m polynomial in n, one 
can solve the problem, still in polynomial size and $O(\log n)$ depth, reducing the 
probability of answering $\perp$ to $1/4^m$ at most. □ 
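
The procedure of Example 1 can be simulated sequentially. The following Python sketch is only an illustration under stated assumptions: it replaces the parallel circuit by a loop, the function name `sample_index` is hypothetical, and `None` stands for the failure symbol $\perp$.

```python
import random
from itertools import accumulate

def sample_index(a, rng=random):
    """Return k in {1..n} with probability a[k-1]/sum(a), or None (failure)."""
    s = list(accumulate(a))              # prefix sums s_1, ..., s_n
    total = s[-1]
    ell = 0
    while (1 << ell) < total:            # smallest ell with s_n <= 2**ell
        ell += 1
    for _ in range(2):                   # the two attempts k_1 and k_2
        # r is uniform in {1, ..., 2**ell}, built from ell unbiased bits
        r = (rng.getrandbits(ell) if ell else 0) + 1
        if r <= total:
            # smallest index i with r <= s_i
            return next(i for i, s_i in enumerate(s, start=1) if r <= s_i)
    return None                          # both attempts fell outside {1..s_n}
```

Each attempt fails with probability below 1/2, so two attempts fail together with probability below 1/4, as in the example.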

We now introduce a parallel hierarchy to classify the uniform random generation 
problem for formal languages. 

Definition 1. A uniform family of probabilistic boolean circuits $\{c_n\}_{n>0}$ is a uniform 
random generator (u.r.g.) for a formal language $L \subseteq \Sigma^*$, if each $c_n$, on input $1^n$, 
computes a value $\omega_n$ in $\Sigma^n \cup \{\perp\}$ such that, if $L \cap \Sigma^n \neq \emptyset$, then: 

1. $\Pr\{\omega_n = \perp\} \le 1/4$, 

2. $\Pr\{\omega_n = x \mid \omega_n \neq \perp\} = 1/\#(L \cap \Sigma^n)$, for every $x \in L \cap \Sigma^n$. 
Moreover, we say that the uniform random generation problem for L belongs to the 
class RNCg if there exists a u.r.g. for L of polynomial size and $O(\log^i n)$ depth. 

Observe that this class is not the usual class RNC^ [7], since here we are not interested 
in computing a boolean function bounding the probability of a wrong answer, but we 
rather want to produce a random output with a given distribution explicitly notifying 
the possible failure of the computation (due to the restriction to unbiased random bits). 

We say that the uniform random generation problem for a language L is RNCj-reducible 
to a class $\mathcal{C}$ of boolean functions if there exists a u.r.g. for L of polynomial 
size and $O(\log n)$ depth which uses oracle nodes in $\mathcal{C}$ (again, the depth of any oracle 
node with fan-in i and fan-out o counts for $\lceil \log(i + o) \rceil$); we denote by RNCg($\mathcal{C}$) the 
class of uniform random generation problems RNCj-reducible to $\mathcal{C}$. 



3 Regular Languages 



In this section we study the circuit complexity of the uniform random generation prob- 
lem for regular languages. We show the problem to be RNCg-reducible to intdiv. 

Let $\mathcal{A} = (\Sigma, Q, q_0, F, \delta)$ be a deterministic finite automaton and define, for $q \in Q$ 
and $0 \le \ell \le n$, the language $L_q^\ell = \{x \in \Sigma^\ell : \delta(q, x) \in F\}$ and set $\eta(q, \ell) = \#L_q^\ell$ 
(where, as usual, $\Sigma^0 = \{\varepsilon\}$). 

We start by defining a family of (random) graphs which allows us to design the circuits 
for solving our problem. For every integer n > 0, define the (directed acyclic) labelled 
graph $G_n(\mathcal{A}) = (V_n, E_n)$ such that $V_n = \{(q, \ell) : q \in Q, 0 \le \ell \le n\}$ and $E_n$ is 
built according to the following procedure: for every $v = (q, \ell) \in V_n$ with $\ell > 0$ pick 
$\sigma_v \in \Sigma$ at random such that, for every $\sigma \in \Sigma$, 

$$\Pr\{\sigma_v = \sigma\} = \frac{\eta(\delta(q, \sigma), \ell - 1)}{\eta(q, \ell)}$$

and add to $E_n$ the edge $((q, \ell), (\delta(q, \sigma_v), \ell - 1))$ with label $\sigma_v$. 

Since $G_n(\mathcal{A})$ is acyclic and all nodes $(q, \ell)$ with $\ell > 0$ have out-degree 1, for 
every $(q, \ell) \in V_n$ and $0 \le m \le \ell$ there exists just one node reachable from $(q, \ell)$ 
through a path of length m. Let $\omega(q, \ell)$ be the word consisting of the labels along the 
path leaving $(q, \ell)$ of length $\ell$: i.e. $\omega(q, \ell) = \sigma_1 \cdots \sigma_\ell$, where $q_1 = q$, $q_{i+1} = \delta(q_i, \sigma_i)$ 
and $((q_i, \ell - i + 1), (q_{i+1}, \ell - i)) \in E_n$, for $1 \le i \le \ell$. Reasoning by induction on 
$1 \le \ell \le n$, one can prove that $\Pr\{\omega(q, \ell) = x\} = 1/\eta(q, \ell)$, for every $L_q^\ell \neq \emptyset$ and 
every $x \in L_q^\ell$. Hence, we obtain the following 

Lemma 1. For every n > 0 such that $L(\mathcal{A}) \cap \Sigma^n \neq \emptyset$, 

$$\Pr\{\omega(q_0, n) = x\} = \frac{1}{\#(L(\mathcal{A}) \cap \Sigma^n)}$$

for every $x \in L(\mathcal{A}) \cap \Sigma^n$. 
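
As an illustration of Lemma 1, the following Python sketch follows the path from $(q_0, n)$ sequentially instead of building the whole random graph in parallel. It is a sketch under stated assumptions: the names `count_table` and `random_word` are hypothetical, the transition function is a dictionary keyed by (state, symbol), and exact integer arithmetic replaces the bounded-error symbol generation of Example 1, so no failure symbol is needed.

```python
import random

def count_table(delta, finals, states, alphabet, n):
    """eta[l][q] = number of words x of length l with delta(q, x) in finals."""
    eta = [{q: int(q in finals) for q in states}]      # length 0: empty word only
    for _ in range(n):
        prev = eta[-1]
        eta.append({q: sum(prev[delta[q, a]] for a in alphabet) for q in states})
    return eta

def random_word(delta, finals, states, alphabet, q0, n, rng=random):
    """Uniform random accepted word of length n, or None if there is none."""
    eta = count_table(delta, finals, states, alphabet, n)
    if eta[n][q0] == 0:
        return None
    q, word = q0, []
    for l in range(n, 0, -1):
        # choose symbol a with probability eta(delta(q,a), l-1) / eta(q, l)
        r = rng.randrange(eta[l][q])
        for a in alphabet:
            w = eta[l - 1][delta[q, a]]
            if r < w:
                break
            r -= w
        word.append(a)
        q = delta[q, a]
    return "".join(word)
```

For instance, with the two-state automaton accepting binary words with an even number of 1s, every generated word of length n has even parity and each such word is equally likely.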

We now show that, if the automaton $\mathcal{A}$ is fixed, given $1^n$ and $G_n(\mathcal{A})$ as input, 
computing the word $\omega(q_0, n)$ belongs to $\mathrm{NC}^1$. To this aim, we need some preliminary 
tools. 
We say that an $nd \times nd$ boolean matrix A is (d, t)-upper-diagonal if A is a block 
matrix of the form $A = (A_{ij})$, where all $A_{ij}$ are $d \times d$ matrices such that $A_{ij} \neq 0$ 
(the zero matrix) iff $j = i + t$ ($d, t > 0$, $i, j = 1, \ldots, n$). 

Observe that, for every pair of $nd \times nd$ boolean matrices A, B, if A is (d, s)-upper-diagonal 
and B is (d, t)-upper-diagonal, then the product AB is (d, s + t)-upper-diagonal: 

$$(AB)_{ij} = \begin{cases} A_{i,i+s}\, B_{i+s,j} & \text{if } j = i + (s + t), \\ 0 & \text{otherwise;} \end{cases}$$

moreover, AB can be obtained by computing in parallel $n - (s + t)$ many products of 
$d \times d$ matrices. For this reason, we can prove the following 



Lemma 2. Let d > 0 be a fixed integer. If A is a (d, s)-upper-diagonal boolean matrix 
of size $nd \times nd$, then computing the boolean power $A^n$ on input A belongs to $\mathrm{NC}^1$. 



Proof. Observe that $A^2$ is (d, 2s)-upper-diagonal and can be computed by a boolean 
circuit of polynomial size and constant depth. So, for every $i \ge 0$, $A^{2^i}$ is a 
$(d, 2^i s)$-upper-diagonal matrix and can be computed in polynomial size and $O(i)$ depth. Then 

$$A^n = \prod_{i : b_i = 1} A^{2^i}$$

where $b_i \in \{0, 1\}$, for $0 \le i \le \lfloor \log n \rfloor$, are the digits of the binary expansion of n, 
i.e. $n = \sum_i b_i 2^i$. Hence $A^n$ can be obtained by a product of a logarithmic number 
of upper-diagonal matrices. Such a product can be computed in polynomial size and 
$O(\log \log n)$ depth. □ 
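
The decomposition of $A^n$ along the binary expansion of n used in the proof can be sketched sequentially in Python. This is only an illustration: the function names are hypothetical, matrices are plain lists of booleans, and the code does not exploit the block structure that gives the depth bounds above.

```python
def bool_mul(A, B):
    """Boolean product of two square matrices given as lists of lists."""
    n = len(A)
    return [[any(A[i][k] and B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bool_power(A, n):
    """Boolean power A**n as the product of the powers A**(2**i) with b_i = 1."""
    size = len(A)
    result = [[i == j for j in range(size)] for i in range(size)]  # identity
    square = A                      # invariant: square == A**(2**i)
    while n:
        if n & 1:                   # binary digit b_i is 1
            result = bool_mul(result, square)
        n >>= 1
        if n:
            square = bool_mul(square, square)
    return result
```

On the adjacency matrix of a graph whose nodes have out-degree at most 1, row v of the m-th power marks the unique node reachable from v by a path of length m, if any.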

Since all the edges of $G_n(\mathcal{A})$ are of the form $((q, \ell), (q', \ell - 1))$ for some $q, q' \in Q$ 
and $0 < \ell \le n$, the adjacency matrix of $G_n(\mathcal{A})$ is (#Q, 1)-upper-diagonal (where each 
block corresponds to a set of nodes with the same second component). 



Lemma 3. For a fixed automaton $\mathcal{A}$, given $G_n(\mathcal{A})$ as input, the computation of 
$\omega(q_0, n)$ belongs to $\mathrm{NC}^1$. 

Proof. Let M be the adjacency matrix of $G_n(\mathcal{A})$. Recall that for every $v = (q, \ell) \in V_n$ 
and $0 \le m \le \ell$ there exists exactly one node that can be reached from v by a path of 
length m, hence the row corresponding to v in $M^m$ contains exactly one 1. Hence, for 
$0 \le i \le n - 1$, all the nodes $(q_i, n - i)$ reachable from $(q_0, n)$ by a path of length i can 
be computed in parallel as in Lemma 2. □ 

Now let us describe the probabilistic boolean circuit $c_n$ which on input $1^n$ computes 
a word in $L(\mathcal{A}) \cap \Sigma^n$ under uniform distribution. First the circuit computes in parallel 
all the coefficients $\eta(q, \ell)$ for $q \in Q$ and $0 \le \ell \le n$. This computation belongs to DIV 
as proven in [2]. Then the circuit computes the graph $G_n(\mathcal{A})$ by generating in parallel 
all random symbols $\sigma_v$ for $v \in V_n$. As shown in Example 1, this step can be executed 
in $O(\log n)$ depth so that, for each $v \in V_n$, $\Pr\{\sigma_v = \perp\} \le 2^{-(2 + \lceil \log(n \#Q) \rceil)}$ and 
hence, the probability that $\sigma_v = \perp$ for some $v \in V_n$ is at most 1/4. Thus, if all labels of 
$G_n(\mathcal{A})$ are in $\Sigma$ the circuit outputs the string $\omega(q_0, n)$ computed in $O(\log n)$ depth as 
shown in Lemma 3; in this case, by Lemma 1, the distribution of the output is uniform. 
Otherwise, if $\sigma_v = \perp$ for some $v \in V_n$, the circuit outputs $\perp$. This proves the following 



Theorem 1. For every regular language, the uniform random generation problem 
belongs to RNCg(DIV). 



4 Context Free Languages 

In this section we study the uniform random generation problem for context-free 
languages. We first show that for unambiguous c.f. languages the problem is RNCj-reducible 
to $\mathrm{L}^* \cup \mathrm{DIV}$. Then we prove that, for all inherently ambiguous c.f. languages 
having polynomial ambiguity degree, the problem is RNCg-reducible to $\#\mathrm{SAC}^1$ and 
hence belongs to RNC^. 



4.1 Unambiguous Context-Free Languages 



Let $\mathcal{G} = (N, \Sigma, S, P)$ be an unambiguous c.f. grammar in Chomsky normal form 
without useless variables, where N is the set of variables, $\Sigma$ the set of terminals, S the 
initial variable and P the set of productions. For every $A \in N$ and every $1 \le \ell \le n$, 
define $\eta(A, \ell)$ as the number of derivation trees of $\mathcal{G}$ rooted at A and deriving a word in 
$\Sigma^\ell$. Moreover, let $L_A^\ell = \{x \in \Sigma^\ell : A \Rightarrow^* x\}$; since $\mathcal{G}$ is unambiguous, $\eta(A, \ell) = \#L_A^\ell$. 

As in the regular language case, we start by defining a family of (random) graphs 
which allows us to design the circuits for solving our problem. For every integer n > 0, 
define the (directed acyclic) graph $G_n(\mathcal{G}) = (V_n, E_n)$ such that 
$V_n = \{(A, r, s) : A \in N, 1 \le r \le s \le n\} \cup \{(\sigma, r) : \sigma \in \Sigma, 1 \le r \le n\}$ and $E_n$ is built 
according to the following procedure: 

- for each $v = (A, r, r) \in V_n$, pick $p_v \in P$ at random such that, for every $(A \to \sigma) \in P$, 

$$\Pr\{p_v = (A \to \sigma)\} = \frac{1}{\eta(A, 1)}$$

and add to $E_n$ the edge $((A, r, r), (\sigma, r))$; 

- for each $v = (A, r, s) \in V_n$ with $s > r$, pick $p_v \in P \times \{1, \ldots, s - r\}$ at random 
such that, for every $(A \to BC) \in P$ and $1 \le k \le s - r$, 

$$\Pr\{p_v = (A \to BC, k)\} = \frac{\eta(B, k)\, \eta(C, s - r + 1 - k)}{\eta(A, s - r + 1)}$$

and add to $E_n$ the edges $((A, r, s), (B, r, r + k - 1))$ and $((A, r, s), (C, r + k, s))$. 

Clearly $G_n(\mathcal{G})$ is acyclic, all its nodes $(A, r, s) \in V_n$ with $s > r$ have out-degree 2, 
and the subgraph of $G_n(\mathcal{G})$ induced by the set of nodes reachable from any $(A, r, s)$ is 
a binary tree with $s - r + 1$ leaves of the form $(\sigma, i) \in V_n$. Let $\omega(A, r, s) = \sigma_r \cdots \sigma_s$, 
where the nodes $(\sigma_i, i)$, for $r \le i \le s$, are the leaves of the subtree of $G_n(\mathcal{G})$ rooted at 
$(A, r, s)$. Reasoning by induction on $1 \le \ell \le n$, one can prove that for every $L_A^\ell \neq \emptyset$ 
and every $x \in L_A^\ell$, if $1 \le r \le s \le n$ and $s - r + 1 = \ell$, then $\Pr\{\omega(A, r, s) = x\} = 
1/\eta(A, \ell)$. As a consequence, we obtain the following 

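Lemma 4. For every n > 0 such that $L(\mathcal{G}) \cap \Sigma^n \neq \emptyset$, 

$$\Pr\{\omega(S, 1, n) = x\} = \frac{1}{\#(L(\mathcal{G}) \cap \Sigma^n)}$$

for every $x \in L(\mathcal{G}) \cap \Sigma^n$. 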
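
A sequential counterpart of this construction can be sketched in Python. The sketch rests on assumptions not in the text: the grammar is given as two hypothetical dictionaries, `unary` (variable to list of terminals) and `binary` (variable to list of pairs of variables), every variable appears as a key of at least one of them, and recursion replaces the random graph.

```python
import random

def count_trees(unary, binary, n):
    """eta[A][l] = number of derivation trees rooted at A yielding length l."""
    variables = set(unary) | set(binary)
    eta = {A: [0] * (n + 1) for A in variables}
    for A in variables:
        eta[A][1] = len(unary.get(A, ()))
    for l in range(2, n + 1):
        for A in variables:
            eta[A][l] = sum(eta[B][k] * eta[C][l - k]
                            for (B, C) in binary.get(A, ())
                            for k in range(1, l))
    return eta

def generate(A, l, unary, binary, eta, rng=random):
    """Random word of length l derived from A; requires eta[A][l] > 0."""
    if l == 1:
        return rng.choice(unary[A])          # each A -> sigma with prob 1/eta(A,1)
    r = rng.randrange(eta[A][l])
    for (B, C) in binary[A]:
        for k in range(1, l):
            # production A -> BC with split k has weight eta(B,k) * eta(C,l-k)
            w = eta[B][k] * eta[C][l - k]
            if r < w:
                return (generate(B, k, unary, binary, eta, rng)
                        + generate(C, l - k, unary, binary, eta, rng))
            r -= w
```

If the grammar is unambiguous the output is uniform over the words of length l derivable from A; for an ambiguous grammar each word is produced with probability proportional to its number of derivation trees.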

We now consider the problem of computing $\omega(S, 1, n)$. 

Lemma 5. Let $\mathcal{G} = (N, \Sigma, S, P)$ be a fixed unambiguous c.f. grammar in Chomsky 
normal form without useless variables. Given $G_n(\mathcal{G})$ as input, the computation 
of $\omega(S, 1, n)$ belongs to L*. 

Proof. First observe that every $(A, r, s) \in V_n$ with $r < s$ has only two out-neighbours 
$(B, r, r + k - 1)$ and $(C, r + k, s)$, for some $1 \le k \le s - r$ and some $B, C \in N$; hence, 
for every $r \le i \le s$, a node $(\sigma, i)$ is reachable from $(A, r, s)$ iff it is reachable either 
from $(B, r, r + k - 1)$ in the case $i < r + k$, or from $(C, r + k, s)$ otherwise. Thus a 
log-space bounded deterministic Turing machine can be designed which tests whether a 
node $(\sigma, i) \in V_n$ is reachable from $(S, 1, n)$. Then the word $\omega(S, 1, n)$ can be computed 
by testing in parallel the reachability of $(\sigma, i)$ from $(S, 1, n)$ for all $1 \le i \le n$ and all 
$\sigma \in \Sigma$. □ 

Now, reasoning as in Section 3, a probabilistic boolean circuit can be designed 
which, on input $1^n$, first computes in parallel all the coefficients $\eta(A, \ell)$, then 
determines the graph $G_n(\mathcal{G})$ and finally generates the string $\omega(S, 1, n)$. The first step 
can be done in DIV [2] while the last one is in L* as shown in Lemma 5. This, together 
with Lemma 4, yields the following 

Theorem 2. For every unambiguous context-free language, the uniform random 
generation problem belongs to RNCg($\mathrm{DIV} \cup \mathrm{L}$). 

4.2 Polynomially Ambiguous Context-Free Languages 

In this section we study the uniform random generation problem for inherently 
ambiguous context-free languages. Let $\mathcal{G} = (N, \Sigma, S, P)$ be a c.f. grammar in Chomsky 
normal form without useless variables; for every $x \in \Sigma^*$, we denote by $\mathrm{amb}_{\mathcal{G}}(x)$ the 
ambiguity of x, i.e., the number of derivation trees of x in $\mathcal{G}$. We call ambiguity degree 
of $\mathcal{G}$ the function $d_{\mathcal{G}} : \mathbb{N} \to \mathbb{N}$ defined by $d_{\mathcal{G}}(n) = \max\{\mathrm{amb}_{\mathcal{G}}(x) : x \in \Sigma^n\}$, for 
every $n \in \mathbb{N}$. Then, $\mathcal{G}$ is said to be polynomially ambiguous if, for some polynomial p(n), 
we have $d_{\mathcal{G}}(n) \le p(n)$ for every n > 0. 
One can easily prove that if $\mathcal{G}$ is an ambiguous c.f. grammar the circuit designed for 
Theorem 2, on input $1^n$, gives output $\omega_n$ such that $\Pr\{\omega_n = \perp\} \le 1/4$ and, for every 
$x \in \Sigma^n$, 

$$\Pr\{\omega_n = x \mid \omega_n \neq \perp\} = \frac{\mathrm{amb}_{\mathcal{G}}(x)}{\eta(S, n)}; \qquad (1)$$

the main change, in this case, is that $\eta(A, \ell)$ and $\#L_A^\ell$ are different. 

In order to obtain the uniform distribution we use a “rejection method” [15], giving 
a parallel version of a procedure described in [4]. Assume now that $\mathcal{G}$ is polynomially 
ambiguous and let p(n) be a polynomial such that $d_{\mathcal{G}}(n) \le p(n)$ for every n > 0. A 
probabilistic boolean circuit can be designed which on input $1^n$ first computes 
$m = p(n)!$ and then executes $4 \cdot p(n)$ times in parallel (and independently of one another) 
the following computation: 

- $y = \perp$; 

- generate $\omega_n$ at random in $L(\mathcal{G}) \cap \Sigma^n$ according to the distribution given by (1); 

- if $\omega_n \neq \perp$, then 

    compute $a = \mathrm{amb}_{\mathcal{G}}(\omega_n)$; 

    generate r uniformly at random in $\{1, \ldots, m\}$; 

    if $a \cdot r \le m$ then $y = \omega_n$; 

- return y. 

Then the circuit outputs $\perp$ if all the $4 \cdot p(n)$ computations return $\perp$, otherwise it outputs 
the first $y \neq \perp$. Reasoning as in [4], it can be proven that the probability of getting $\perp$ 
is at most 1/4; otherwise, the output is distributed uniformly at random in $L(\mathcal{G}) \cap \Sigma^n$. 
Evaluating the complexity of the circuit, we observe that the computation of $\mathrm{amb}_{\mathcal{G}}(x)$ 
for all $x \in \Sigma^*$ belongs to $\#\mathrm{SAC}^1$ [21]. Hence, since both L and DIV are included in 
$\#\mathrm{SAC}^1$, we obtain the following 

Theorem 3. For every language generated by a polynomially ambiguous context-free 
grammar, the uniform random generation problem belongs to RNCg($\#\mathrm{SAC}^1$). 
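
The rejection step can be illustrated by a short Python sketch. It is a sequential simulation under assumptions: `sample_weighted` is a hypothetical callable returning a word with probability proportional to its ambiguity (or `None` on failure), `amb` is a hypothetical callable returning the ambiguity of a word, and `p` is the value of the polynomial bound for the current length.

```python
import math
import random

def rejection_generate(sample_weighted, amb, p, rng=random):
    """Turn an ambiguity-weighted sampler into a uniform one by rejection."""
    m = math.factorial(p)            # every ambiguity a <= p divides m
    for _ in range(4 * p):           # the circuit runs these trials in parallel
        x = sample_weighted(rng)
        if x is None:                # the underlying generator failed
            continue
        a = amb(x)
        r = rng.randint(1, m)        # uniform in {1, ..., m}
        if a * r <= m:               # holds with probability exactly 1/a
            return x                 # first accepted word
    return None                      # all trials were rejected
```

A word x is drawn with probability proportional to amb(x) and kept with probability 1/amb(x), so every word is accepted with the same probability in each trial.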



5 One-Way Nondeterministic Auxiliary Pushdown Automata 

In this section we describe a family of probabilistic boolean circuits to solve our 
problem in the case of languages accepted by one-way nondeterministic auxiliary pushdown 
automata (1-NAuxPDA, for short). These circuits are based on the computation of the 
ambiguity of terminal strings with respect to different c.f. grammars. For this reason we 
first study the problem of computing the value $\mathrm{amb}_{\mathcal{G}}(x)$ having in input a c.f. grammar 
$\mathcal{G}$ in Chomsky normal form and a word $x \in \Sigma^*$. 



5.1 The General Ambiguity Problem 

We start by recalling a result given in [18] to evaluate arithmetic circuits of size n 
and degree d in $O(\log n \log(nd))$ parallel time (see also [16]). Here, by arithmetic 
circuit over a semiring R we mean a labelled directed acyclic graph with three kinds of 
vertices: input nodes of fan-in 0 with labels in R, addition nodes of fan-in greater than 
1 labelled by +, and multiplication nodes of fan-in 2 labelled by ×; we also assume 
that there is no edge between two multiplication nodes. The degree of the circuit is 
the maximum degree of its nodes, defined by induction as follows: every input node 
has degree 1, the degree of every multiplication node is the sum of the degrees of its 
two inputs and the degree of every addition node is the maximum of the degrees of its 
inputs. The value of a node can be defined in the standard way: all input nodes take as 
value their labels, the value of an addition (multiplication) node is the sum (product) of 
the values of its inputs. 

Proposition 1 ([18]). The values of all nodes in any arithmetic circuit over R of size n 
and degree d can be computed in $O(\log n \log(nd))$ parallel time using M(n) processors, 
where M(n) is the number of processors required to multiply two $n \times n$ matrices 
over R in $O(\log n)$ time. 

Now, in order to compute $\mathrm{amb}_{\mathcal{G}}(x)$ on input $\mathcal{G} = (N, \Sigma, S, P)$ and $x = \sigma_1 \sigma_2 \cdots 
\sigma_n \in \Sigma^n$ we define an arithmetic circuit $C(\mathcal{G}, x)$ on $\mathbb{N}$ implementing a counting 
version of the traditional CYK algorithm. The input nodes of $C(\mathcal{G}, x)$ are $(A, i, i)$, where 
$A \in N$, $1 \le i \le n$, and they are labelled by 1 if $(A \to \sigma_i) \in P$ and 0 otherwise. 
Addition nodes are $(A, i, j)$ with $A \in N$, $1 \le i < j \le n$, and multiplication nodes are 
$(B, C, i, k, j)$ with $(D \to BC) \in P$ for some $D \in N$, $1 \le i \le k < j \le n$. The 
inputs of every addition node $(A, i, j)$ are the nodes $(B, C, i, k, j)$ such that $(A \to BC) \in P$; 
the inputs of every multiplication node $(B, C, i, k, j)$ are the nodes $(B, i, k)$ and 
$(C, k + 1, j)$. It is easy to show that the value of node $(S, 1, n)$ is $\mathrm{amb}_{\mathcal{G}}(x)$. 
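
The values of the nodes of this circuit can be computed sequentially by a counting variant of CYK. The Python sketch below is an illustration under assumptions: the grammar is encoded by two hypothetical dictionaries, `unary` (variable to list of terminals) and `binary` (variable to list of pairs of variables), and every variable appears as a key of at least one of them.

```python
def ambiguity(unary, binary, start, x):
    """Number of derivation trees of the word x rooted at `start` (CNF grammar)."""
    n = len(x)
    if n == 0:
        return 0
    variables = set(unary) | set(binary)
    val = {}                                     # val[A, i, j]: value of node (A, i, j)
    for i in range(1, n + 1):
        for A in variables:                      # input nodes (A, i, i)
            val[A, i, i] = int(x[i - 1] in unary.get(A, ()))
    for span in range(2, n + 1):
        for i in range(1, n - span + 2):
            j = i + span - 1
            for A in variables:
                # addition node (A, i, j) sums the multiplication nodes
                # (B, C, i, k, j), each worth val[B, i, k] * val[C, k+1, j]
                val[A, i, j] = sum(val[B, i, k] * val[C, k + 1, j]
                                   for (B, C) in binary.get(A, ())
                                   for k in range(i, j))
    return val[start, 1, n]
```

For the grammar with productions S → SS and S → a, the number of derivation trees of a word of n letters is the (n−1)-th Catalan number, which gives a quick sanity check.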

Lemma 6. The problem of computing $\mathrm{amb}_{\mathcal{G}}(x)$ given as input a terminal string x and 
a context-free grammar $\mathcal{G}$ in Chomsky normal form, can be solved by a uniform family 
of boolean circuits of polynomial size and $O((\log n + \log m)^2)$ depth, where $n = |x|$ 
and m is the size of $\mathcal{G}$. 

Proof (sketch). We observe that Proposition 1 is based on a parallel algorithm which, 
for an input arithmetic circuit of size n and degree d, executes 0(log hd) times a cycle 
of operations, the most expensive one being the product of two h x h matrices over R. 

In our case, h = 0(n^ ■ m), d = n and the value of the nodes is bounded by m^”. 
Hence, the above matrix product can be computed by a boolean circuit of polynomial 
size and 0(log n -\- log m) depth. □ 

Using the same approach, one can compute on input $\mathcal{G} = (N, \Sigma, S, P)$, $A \in N$ 
and $\ell > 0$, the number $\eta_{\mathcal{G}}(A, \ell)$ of derivation trees of $\mathcal{G}$ rooted at A and deriving a 
word in $\Sigma^\ell$. It is sufficient to map all terminal symbols $\sigma \in \Sigma$ into the unique symbol 
z, so defining a new c.f. grammar $\mathcal{G}_A = (N, \{z\}, A, P')$, where $P'$ is obtained from 
P by replacing all productions $(B \to \sigma) \in P$ with $B \to z$ and labelling every input node 
$(B, i, i)$ of the circuit $C(\mathcal{G}_A, z^\ell)$ with the cardinality of $\{(B \to \sigma) \in P : \sigma \in \Sigma\}$. 
Hence, $\eta_{\mathcal{G}}(A, \ell) = \mathrm{amb}_{\mathcal{G}_A}(z^\ell)$ and the computation can be carried out as in Lemma 6. 

This allows us to apply the approach presented in Section 4.1 to generate 
at random a word of length n, according to the distribution given in (1), assuming the 
grammar as a part of the input. 
5.2 Polynomially Ambiguous 1-NAuxPDA 

We recall that a 1-NAuxPDA is a nondeterministic Turing machine having a one-way 
read-only input tape, a pushdown tape and a log-space bounded two-way read-write 
work tape [6, 5]. It is known that the class of languages accepted by 1-NAuxPDA 
working in polynomial time coincides with the class of decision problems reducible 
to context-free recognition via one-way log-space reduction [17]. 

Given a 1-NAuxPDA $\mathcal{M}$ we denote by $\mathrm{amb}_{\mathcal{M}}(x)$ the number of accepting 
computations of $\mathcal{M}$ on input $x \in \Sigma^*$, and call ambiguity degree of $\mathcal{M}$ the function 
$d_{\mathcal{M}} : \mathbb{N} \to \mathbb{N}$ defined by $d_{\mathcal{M}}(n) = \max\{\mathrm{amb}_{\mathcal{M}}(x) : x \in \Sigma^n\}$ for every $n \in \mathbb{N}$. Then, $\mathcal{M}$ 
is said to be polynomially ambiguous if, for some polynomial p(n), we have $d_{\mathcal{M}}(n) \le p(n)$ 
for every n > 0. 

It is known that, if $\mathcal{M}$ works in polynomial time, given an integer input n > 0, a c.f. 
grammar $\mathcal{G}_n$ in Chomsky normal form, of size polynomial in n, can be built such that 
$L(\mathcal{G}_n) \cap \Sigma^n = L(\mathcal{M}) \cap \Sigma^n$ [6]. This construction can be refined in such a way that the 
ambiguity degree of $\mathcal{G}_n$ does not increase with respect to the ambiguity degree of $\mathcal{M}$, 
i.e., for every $n \in \mathbb{N}$, the number of derivation trees of any word $x \in \Sigma^n$ in $\mathcal{G}_n$ is less 
than or equal to the number of accepting computations of $\mathcal{M}$ on input x [4]. Moreover, the 
problem of computing such a refined on input 1" belongs to NC^ as shown in [1]. 

Therefore, the random generation problem for the language accepted by a polynomial 
time 1-NAuxPDA $\mathcal{M}$ is reduced to generating words of length n from the grammar $\mathcal{G}_n$ 
uniformly at random. This can be done by a general version of the algorithm described in 
Subsection 4.2 where the c.f. grammar is part of the input. Thus, if the ambiguity of 
$\mathcal{M}$ is polynomial, by Lemma 6, the overall computation can be carried out in $O(\log^2 n)$ 
depth and polynomial size. 

Theorem 4. For every language accepted by a polynomially ambiguous 1-NAuxPDA 
working in polynomial time, the uniform random generation problem belongs to RNC^. 



6 The General Case for Regular Languages 

In this section we consider the random generation problem for regular languages 
assuming as input both the length of the word to be generated and the deterministic finite 
automaton recognising the language. Using the same notation as in Section 3, we say that 
a family of probabilistic boolean circuits $\{c_{n,m}\}_{n,m>0}$ solves the general problem of 
uniform random generation for regular languages, if each $c_{n,m}$, having in input $1^n$ and 
a deterministic finite automaton $\mathcal{A}$ of size m, computes a value $\omega_{n,m}$ in $\Sigma^n \cup \{\perp\}$ such 
that, if $L(\mathcal{A}) \cap \Sigma^n \neq \emptyset$, then: 

1. $\Pr\{\omega_{n,m} = \perp\} \le 1/4$, 

2. $\Pr\{\omega_{n,m} = x \mid \omega_{n,m} \neq \perp\} = 1/\#(L(\mathcal{A}) \cap \Sigma^n)$, for every $x \in L(\mathcal{A}) \cap \Sigma^n$. 

The problem can be solved by a family of circuits designed as in Section 3 to generate 
a word uniformly at random from a fixed regular language. Here, there are two main 
differences. First of all, since $\mathcal{A} = (\Sigma, Q, q_0, F, \delta)$ is part of the input, the coefficients 
$\eta(q, \ell)$ for $q \in Q$ and $0 \le \ell \le n$ can be computed in DET (rather than in DIV), because 
such a task is reducible to computing the $\ell$-th power of an $m \times m$ integer matrix. Second, 
once the graph $G_n(\mathcal{A})$ is obtained, the computation of $\omega(q_0, n)$ belongs to L* (rather 
than $\mathrm{NC}^1$) since it is reducible to a reachability problem in a directed acyclic graph whose 
nodes have out-degree at most 1 [8]. Hence, we obtain the following 

Theorem 5. The general problem of uniform random generation for regular languages 
is solved by a uniform family of probabilistic boolean circuits of polynomial size and 
$O(\log(n + m))$ depth with oracle nodes in DET. 



7 Concluding Remarks 

In this paper we have studied the circuit complexity of the uniform random generation 
problem for several classical formal languages. An interesting application of the results 
presented here is related to counting problems, i.e. computing $\#(L \cap \Sigma^n)$ on input 
n > 0. 

It is well-known that random generation is related to counting and that there are 
cases in which exact counting is hard, while uniform random generation is easy 
and allows one to obtain approximation schemes for the counting problem [14, 13]. This is 
for instance the case for some finitely ambiguous context-free languages, as discussed 
in [3, 4]. In a forthcoming paper we will show that a RNC^ approximation scheme can 
be designed for the counting problem of every language accepted by a polynomial time 
1-NAuxPDA of polynomially bounded ambiguity. 
Abstract. We consider the following problem: Given a subset S of size 
n of a universe $\{0, \ldots, u - 1\}$, construct a minimal perfect hash function 
for S, i.e., a bijection h from S to $\{0, \ldots, n - 1\}$. The parameters of 
interest are the space needed to store h, its evaluation time, and the 
time required to compute h from S. The number of bits needed for the 
representation of h, ignoring the other parameters, has been thoroughly 
studied and is known to be $n \log e + \log\log u \pm O(\log n)$, where “log” 
denotes the binary logarithm. A construction by Schmidt and Siegel uses 
$O(n + \log\log u)$ bits and offers constant evaluation time, but the time 
to find h is not discussed. We present a simple randomized scheme that 
uses $n \log e + \log\log u + o(n + \log\log u)$ bits and has constant evaluation 
time and $O(n + \log\log u)$ expected construction time. 

Keywords: Computational and structural complexity, algorithms and 
data structures, perfect hashing, sparse tables, space complexity. 



1 Introduction 

Suppose that S is a subset of size n of the universe $\{0, \ldots, u - 1\}$ for some 
$n, u \in \mathbb{N} = \{1, 2, \ldots\}$. A function h defined on $\{0, \ldots, u - 1\}$ is said to be perfect 
for S if it is injective on S. If, moreover, the range of h is the set $\{0, \ldots, n - 1\}$, 
then h is called a minimal perfect hash function for S. We consider the problem 
of constructing minimal perfect hash functions for given sets of nonnegative 
integers. 

Let A be an algorithm that inputs an arbitrary set S of nonnegative inte- 
gers and outputs a minimal perfect hash function h for S. Several performance 
parameters of A are of interest: 

— Encoding size: The number of bits of storage occupied by the representation 
of h output by A. 

— Evaluation time: The time needed to compute h{x) for an arbitrary x in the 
domain of h. 

— Construction time: The time needed to compute h from S. 

— Working space: The amount of space needed to compute h from S. 

We view these parameters as functions of $n = |S|$ and $u = 1 + \max S$. Fredman, 
Komlos and Szemeredi described a randomized construction that achieves 
$O(n \log u)$ encoding size, $O(1)$ evaluation time and $O(n)$ expected construction 
time [4]. Strictly speaking, their scheme yields a function h whose range is of 
size $O(n)$ rather than n, but it is easy to obtain a minimal perfect hash function 
within the same resource bounds. Using a counting argument, Fredman and 
Komlos proved a worst-case lower bound of $n \log e + \log\log u - O(\log n)$ bits for 
the encoding size of a minimal perfect hash function for a subset of size n of a 
universe of size u, provided that $u \ge n^{2+\epsilon}$ for some fixed $\epsilon > 0$ [3] (an easy 
alternative proof was given by Radhakrishnan [10]). That this bound is almost tight 
follows by comparing it with an upper bound of $n \log e + \log\log u + O(\log n)$ 
bits given by Mehlhorn [8, Sect. III.2.3, Thm. 8]. His construction, however, 
has tiiogu) construction and evaluation time. Schmidt and Siegel showed 

the existence of minimal perfect hash functions combining an encoding size of 
$\Theta(n + \log\log u)$ bits with $O(1)$ evaluation time, but the time needed to find such 
functions was not discussed [11]. We present a new construction that not only 
works in almost linear expected time while still offering constant-time evalua- 
tion, but also reduces the encoding size to within lower-order terms of the lower 
bound. 

Our model of computation is a unit-cost word RAM [5] with an instruction 
set including multiplication and integer division. We denote the word length 
of the machine by w and assume that every input set S consists of numbers 
representable in single words, i.e., $\max S < 2^w$. We will measure the encoding 
size of a hash function in bits, but the working space needed for its construction 
in w-bit words. Our main result is expressed in the following theorem. 

Theorem 1. For all integers $n, u, w \ge 4$ with $u \le 2^w$ and for every given subset 
S of size n of $\{0, \ldots, u - 1\}$, a minimal perfect hash function for S that can be 
evaluated in $O(1)$ time and stored in $n \log e + \log\log u + O(n (\log\log n)^2 / \log n + 
\log\log\log u)$ bits can be constructed in $O(n + \log\log u)$ expected time using $O(n)$ 
words of working space on a unit-cost word RAM with a word length of w bits 
and an instruction set including multiplication and integer division. 

Our approach is very simple. Suppose that we are given an input set S 
of size n. Repeatedly replacing the elements of S by their remainders modulo 
suitable primes, we begin by mapping S bijectively to a set S' whose elements 
are either bounded by a polynomial in n or far smaller than max S'. In the 
former and more interesting case, we proceed to partition S' into groups of 
elements small enough to be handled by the doubly exponential algorithm of 
Mehlhorn mentioned above. The division into groups is done in two stages, each 
of which defines a group as the set of elements mapped to a common value 
by a suitable hash function, a so-called bucket of the hash function. The hash 
functions employed for this purpose have to be chosen rather carefully, as the 
maximum bucket size must be within a constant factor of the average bucket size. 
Essential in achieving a construction time that is linear and not merely almost 
linear in n is the observation that the superlinear component in the running time 
of Mehlhorn’s algorithm can be amortized over all groups. 
2 Reducing the Size of the Universe 

We denote by the term range reduction the process of reducing an instance of 
the problem at hand, namely computing a minimal perfect hash function for a 
given set S, to an instance that involves smaller input numbers, i.e., the process 
of reducing the size of the universe. We employ a range reduction based on the 
following lemma, proved essentially as [4, Lemma 2]. 

Lemma 2. There is a constant $\beta \in \mathbb{N}$ such that for every nonempty set S 
of nonnegative integers and for every $m \ge \beta |S|^2 \log(1 + \max S)$, the function 
$x \mapsto x \bmod p$ is injective on S for at least half of the primes p bounded by m. 

Let S be an input set and take $n = |S|$, $u = 1 + \max S$, $\Lambda = \beta n^2 \lceil \log u \rceil$ 
and $D = \{p \in \mathbb{N} \mid p \le \Lambda$ and the function $x \mapsto x \bmod p$ is injective on $S\}$. We 
assume that $n \ge 4$. In order to put Lemma 2 to use, we need a way to compute 
an element of D. 

If $\log u$ and therefore $\Lambda$ are polynomial in n, we can pick an integer p uniformly 
at random from $M = \{1, \ldots, \Lambda\}$, apply Rabin’s randomized primality test [9] 
to p $\lceil \log n \rceil$ times and, if p passes this test (which happens with probability at 
most 1/n if p is composite), proceed to test directly whether $p \in D$ by means 
of radix sorting. If p fails any test, we immediately discard it and pick a new 
random integer, continuing until an element of D is encountered. By Lemma 2, 
the expected number of trials in which p is prime is bounded by a constant, and 
the expected time spent in such trials is $O(n)$. By the prime number theorem, 
the density of primes in M is $\Omega(1/\log \Lambda) = \Omega(1/\log n)$, so that the expected 
total number of trials is $O(\log n)$. Since Rabin’s test works in $(\log n)^{O(1)}$ time, 
the total expected time is $O(n)$. 
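
The search for an element of D in this first case can be sketched in Python. The sketch makes several assumptions that are not from the text: the function names are hypothetical, the injectivity check uses a hash set instead of radix sorting, the caller supplies the bound (the constant of Lemma 2 is not specified here), and the loop terminates only if the bound is large enough for a suitable prime to exist.

```python
import random

def is_probable_prime(p, rounds, rng):
    """Miller-Rabin test: primes always pass, a composite rarely does."""
    if p < 2:
        return False
    for q in (2, 3, 5, 7):
        if p % q == 0:
            return p == q
    d, s = p - 1, 0
    while d % 2 == 0:                 # write p - 1 = d * 2**s with d odd
        d //= 2
        s += 1
    for _ in range(rounds):
        a = rng.randrange(2, p - 1)
        x = pow(a, d, p)
        if x in (1, p - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, p)
            if x == p - 1:
                break
        else:
            return False              # a witnesses that p is composite
    return True

def find_reduction_prime(S, bound, rng=random):
    """Random p <= bound that passes the primality test and is injective on S."""
    rounds = max(1, len(S).bit_length())      # at least ceil(log2 n) rounds
    while True:
        p = rng.randint(1, bound)
        # cheap primality filter first, linear-time injectivity test second
        if is_probable_prime(p, rounds, rng) and len({x % p for x in S}) == len(S):
            return p
```

Running the primality filter before the injectivity test mirrors the argument above: most candidates are discarded cheaply, and the linear-time test is spent only on the few candidates that look prime.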

For logu > n^, we sketch a different method and allow an expected time of 
$O(n + \log \Lambda) = O(n + \log\log u)$. Note first that $\Lambda$ can be computed within this 
time bound. We pick a set R of $\lceil \log \Lambda \rceil$ random elements of M and store these, 
each replicated $\lfloor \sqrt{\Lambda} \rfloor$ times, together in a single computer word A. The condition 
logu > ensures that the word length is sufficient for this to be possible 
(unless u is smaller than some constant). We also create a word B containing 
the sequence $1, \ldots, \lfloor \sqrt{\Lambda} \rfloor$, replicated $\lceil \log \Lambda \rceil$ times, and proceed to divide each 
number in A by the corresponding number in B. Simulating the school method 
for long division, this can be carried out simultaneously for all pairs of numbers 
in $O(\log \Lambda)$ time; a more detailed discussion of similar computations can be found 
in [5]. As a result, we learn for each element of R whether it has a divisor bounded 
by $\lfloor \sqrt{\Lambda} \rfloor$, i.e., whether it is composite. The set R was chosen sufficiently large 
to ensure that with probability $\Omega(1)$ it contains at least one prime. If this is the 
case, we pick such a prime p and test whether $p \in D$. Because p is much smaller 
than u, this can be done in $O(n)$ time by sorting [5]; alternatively, it can be done 
in $O(n)$ expected time using universal hashing [1]. If no element of D is found, 
we repeat the entire procedure. Since each trial takes $O(n + \log\log u)$ time and 
succeeds with probability $\Omega(1)$, the overall expected time is $O(n + \log\log u)$. 

Faced with an input set S with $|S| = n$, we repeatedly apply the reduction 
based on Lemma 2 and discussed above until we reach a set S' with max S' < n^, 
but at most four times. The expected time to do this is $O(n + \log\log u)$, and 
$O(n)$ words of working space suffice. The first reduction requires a prime of at 
most $\log \Lambda = \log\log u + O(\log n)$ bits to be stored as part of the representation 
of the final minimal perfect hash function. The number of bits required for all 
following reductions is $O(\log n + \log\log\log u)$. 

After four reduction steps, we have a set S' with maxS" = 0{n? n + 
log^"^^w)). If the condition maxS" < is still not satisfied, n = O(log^^^'u), and 
we can simply store S' using the method of Fredman, Komlós and Szemerédi [4],
which requires 0(n log maxS") = 0((log^^^'u)^) bits of storage, for a total of 
log log u + o(log log log u) bits. In the following, we can therefore assume without 
loss of generality that the input set S satisfies maxS < n^. 

3 Splitting into Groups 

Our goal in this section is to partition the set $S$ into $O(n/h)$ groups of at most $h$
elements each, where $h = \gamma \log n / \log\log n$ for a constant $\gamma > 0$ to be chosen later.
Our main tool is a class $\mathcal{R}$ of hash functions introduced by Dietzfelbinger and
Meyer auf der Heide [2] (another possibility would be to use a class defined by
Siegel [12]). For our purposes, the distinguishing feature of $\mathcal{R}$ is that a function
drawn at random from $\mathcal{R}$ is likely to spread a key set about evenly over its range.

We begin by defining the class $\mathcal{R}$. Fix a prime $p > u$, let $U = \{0, \ldots, p-1\}$
and, for $d, s \in \mathbb{N}$, take

\[
\mathcal{H}^d_s = \{ h_a \mid a = (a_0, \ldots, a_d) \in U^{d+1} \},
\]

where, for $a = (a_0, \ldots, a_d) \in U^{d+1}$, $h_a : U \to \{0, \ldots, s-1\}$ is the function given
by

\[
h_a(x) = \Bigl( \Bigl( \sum_{i=0}^{d} a_i x^i \Bigr) \bmod p \Bigr) \bmod s
\]

for all $x \in U$. Informally, $\mathcal{H}^d_s$ is known as the class of polynomials of degree $d$.
The class $\mathcal{R}$ depends on four parameters $r, s, d_1, d_2 \in \mathbb{N}$, a dependence made
explicit by writing $\mathcal{R}$ as $\mathcal{R}(r, s, d_1, d_2)$. For $r, s, d_1, d_2 \in \mathbb{N}$,

\[
\mathcal{R}(r, s, d_1, d_2) = \{ h_{(f, g, a_0, \ldots, a_{r-1})} \mid f \in \mathcal{H}^{d_1}_r,\ g \in \mathcal{H}^{d_2}_s \text{ and } 0 \le a_0, \ldots, a_{r-1} < s \},
\]

where, for $f \in \mathcal{H}^{d_1}_r$, $g \in \mathcal{H}^{d_2}_s$ and $a_0, \ldots, a_{r-1} \in \{0, \ldots, s-1\}$, $h_{(f, g, a_0, \ldots, a_{r-1})} : U \to \{0, \ldots, s-1\}$
is the function given by

\[
h_{(f, g, a_0, \ldots, a_{r-1})}(x) = (g(x) + a_{f(x)}) \bmod s,
\]

for all $x \in U$. One way to visualize $\mathcal{R}$ is as follows: A key $x \in U$ is first mapped
to row $f(x)$ and column $g(x)$ of an $r \times s$ table. Then row $i$ is rotated cyclically
a distance of $a_i$, for $i = 0, \ldots, r-1$, and the resulting column number is taken
as the final function value.
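
The row-and-column picture translates directly into code. The following Python sketch draws a function of the form (g(x) + a_{f(x)}) mod s with f and g random polynomials modulo a prime; the function names and the concrete parameter values in the usage note are assumptions made for illustration, not values taken from the paper.

```python
import random

def draw_poly(p, d, s, rng):
    """Draw a degree-d polynomial hash function with range {0..s-1}:
    random coefficients a_0..a_d in {0..p-1}, evaluated mod p, then mod s."""
    coeffs = [rng.randrange(p) for _ in range(d + 1)]
    def h(x):
        acc = 0
        for a in reversed(coeffs):      # Horner's rule modulo p
            acc = (acc * x + a) % p
        return acc % s
    return h

def draw_from_R(p, r, s, d1, d2, rng):
    """Draw h with h(x) = (g(x) + a_{f(x)}) mod s."""
    f = draw_poly(p, d1, r, rng)                     # row index in {0..r-1}
    g = draw_poly(p, d2, s, rng)                     # column index in {0..s-1}
    offsets = [rng.randrange(s) for _ in range(r)]   # rotations a_0..a_{r-1}
    return lambda x: (g(x) + offsets[f(x)]) % s

def bucket_sizes(h, keys, s):
    """Count how many keys land in each of the s buckets."""
    sizes = [0] * s
    for x in keys:
        sizes[h(x)] += 1
    return sizes
```

For instance, `bucket_sizes(draw_from_R(10007, 8, 16, 3, 3, random.Random(1)), range(1000), 16)` gives the bucket loads for 1000 keys; inspecting the maximum of that list is the quantity the next lemma is concerned with.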

The nontrivial fact about $\mathcal{R}$ of interest to us is expressed in the following
lemma, related to Lemma 4.4 and Theorem 4.6 of [2].

Lemma 3. For every $c > 0$, there is a $C > 0$ such that for all $r, s, d_1, d_2 \in \mathbb{N}$
with r <n, s < cn/logn, rs > d\>C and c ?2 > C, the relation 

$\forall i \in \{0, \ldots, s-1\} : |\{x \in S \mid h(x) = i\}| < Cn/s$

holds with probability at least 1 — n~^ if h is chosen uniformly at random from 
$\mathcal{R}(r, s, d_1, d_2)$.



Informally, the lemma says that if r and s are chosen so that rs = 
for some fixed $\varepsilon > 0$ and $s = O(n/\log n)$, then for sufficiently large $d_1$ and $d_2$, the
maximum bucket size of a random function from $\mathcal{R}(r, s, d_1, d_2)$ will be within a
constant factor of the average bucket size, except with negligible probability.

We prove Lemma 3 using several auxiliary lemmas. Note that we can assume
$n$ to be larger than an arbitrary constant, since the maximum bucket size is trivially
bounded by Cn/s if C > kn. In particular, we assume that s < n. 

Let f We begin by showing, using the following lemma, that 

if / G is chosen uniformly at random and di is sufficiently large, then the 
maximum bucket size maxo<j<r \{x G S \ f{x) = j}| of / is bounded by 2^, 
except with negligible probability. 



Lemma 4. Let $n, d \in \mathbb{N}$, let $X_1, \ldots, X_n$ be $d$-independent, equidistributed 0-1-variables,
let $\mu \ge E(X_i)$ and assume that $n\mu > d$. Then, for some $a$ that
depends only on $d$ and for every $C > 0$,



Pr 






< 



a{npY/'^ 



The lemma is essentially [7, Corollary 4.20]. We generalize the original formulation
in a trivial way by allowing $\mu \ge E(X_i)$ instead of taking $\mu = E(X_i)$ and
replace the original condition $n > d/(2\mu)$ by the stronger condition $n\mu > d$,
which seems called for by the proof.

In our context, with $S = \{x_1, \ldots, x_n\}$, we fix $j \in \{0, \ldots, r-1\}$ and take

\[
X_i = \begin{cases} 1, & \text{if } f(x_i) = j, \\ 0, & \text{otherwise,} \end{cases}
\]

for i = 1, . . . , n. Then Xi , . . . , Xn satisfy the conditions of Lemma 4 with d = d\ 
and p = 2/ r + di/n. For every d\ and for sufficiently large n, we have f > np, 
and therefore the quantity |{x G S' | f{x) = j}\ = YYi=i W is bounded by 2^, 
except with probability at most where a depends only on di. For di and 

subsequently n chosen sufficiently large, the latter probability is at most n~^, so 
maxo<j<r \{x G S | f{x) = j}| > 2^ with probability at most rn~^ < n“^. 

Assuming that / has been chosen so that its maximum bucket size is indeed 
bounded by 2^, we next show that if g G TtY Is chosen uniformly at random 
and d 2 is sufficiently large, then for each application of g to a bucket of /, the 
maximum bucket size is bounded by d 2 , except with negligible probability. 

Lemma 5 ([2, Fact 2.2(b)]). For all $m, s, d \in \mathbb{N}$ and for every subset $B$ of
$U$ of size $m$, if $g$ is chosen uniformly at random from $\mathcal{H}^d_s$, then
$\max_{0 \le i < s} |\{x \in B \mid g(x) = i\}| < d$ with probability at least $1 - m \cdot (2m/s)^d$.

We use the lemma for j = 0, . . . , r — 1 with B = Bj = {x G S \ f{x) = j}. 
In our case, d = d 2 , and m < 2f, so that 2m/s < ^ 

Thus, if d 2 and subsequently n are chosen sufficiently large, then Pr(|{x G B \ 
g{x) = i}| > ^ 2 ) < n~^ for i = 0, . . . , s — 1 and Pr(maxo<i<s \{x G B \ g{x) = 
01 > (^ 2 ) < sn~^ < 

We now come to the proof of Lemma 3 itself. Concerning the random choice
of $h$, we assume that $f$ and $g$ have already been selected, so that only $a_0, \ldots, a_{r-1}$
remain to be chosen. Then, for each fixed $i \in \{0, \ldots, s-1\}$, the quantity
$Z_i = |\{x \in S \mid h(x) = i\}|$ is the sum of independent random variables $X_0, \ldots, X_{r-1}$
where, for $j = 0, \ldots, r-1$, $X_j = |\{x \in B_j \mid h(x) = i\}|$. It is easy to see that
$E(Z_i) = n/s$. We will assume that $X_j < d_2$ for $j = 0, \ldots, r-1$; by what was
shown above, this ignores an event of negligible probability.



Lemma 6 (Hoeffding; see [6, p. 104]). Let $Z$ be a sum of independent nonnegative
random variables, each bounded by $z > 0$, and take $\mu = E(Z)$. Then,
for all $t > 0$,



Fr{Z > p-\-t) < 




Using the lemma with $z = d_2$, $\mu = n/s$ and $t = (C - 1)\mu$, we obtain
Pr(Zi > Cn/s) < 

For sufficiently large C, we have Fr{Zi > Cn/s) < n~^ and Pr(maxo<i<s Zi > 
Cn/s) < sn~^ < n“^. Adding the three “error” probabilities identified above, 
we see that the assertion of Lemma 3 holds with probability at least 1 — 3n“^ > 
1 — n~^ . This ends the proof of Lemma 3. 



The condition $s = O(n/\log n)$ of Lemma 3 prevents us from splitting $S$
into groups of size at most $h$ in one go. We therefore begin by splitting $S$ into
groups of size 0((logn)^). We take r = 0{^/n), s = 0(n/(logn)^) and C, 
$d_1$ and $d_2$ according to Lemma 3 (for $c = 3$, say) and repeatedly choose
$h \in \mathcal{R}(r, s, d_1, d_2)$ uniformly at random until $\max_{0 \le i < s} |\{x \in S \mid h(x) = i\}| < Cn/s$.
By Lemma 3, the expected number of trials is $O(1)$, and the computation can
be carried out in $O(n)$ expected time using $O(n)$ words of working space. By the
assumption max S <r? , the chosen function h can be represented in 0(r log n) = 
0{ydnlogn) space, and its evaluation takes 0(1) time. 

For each of the resulting groups of size at most I = 0((logn)^), we use a 
single range reduction based on Lemma 2 to force all integers in the group below 
an integer v with v = (logn)'^*'^^. This requires the storage of one prime of 
O(loglogn) bits per group, for a total of o(n/logn) bits. 

Our remaining task is to reduce the group size further from at most $l$ to
at most $h = \gamma \log n / \log\log n$. We do this by another application of Lemma 3
with r = O((log n)^) and with $s \ge Cl/h$ (ensuring that each group is split
into subgroups of at most $h$ elements each), but $s = O(l/h)$ (ensuring that the
total number of subgroups is $O(n/h)$). The space needed for storing the required
functions in $\mathcal{R}$ is $O(r \log\log n)$ bits per group, for a total of $O(n \log\log n / \log n)$
bits.

4 Perfect Hashing by Brute Force 

In the previous section we reduced the original problem of size $n$ to a collection
of subproblems of size $O(\log n / \log\log n)$ and involving numbers of size polylogarithmic
in $n$. In this section we discuss how to solve a single such subproblem
using nearly minimal space.

Fix integers $m, v > 2$ and let $\mathcal{F}$ be the set of all functions from $V = \{0, \ldots, v-1\}$
to $\{0, \ldots, m-1\}$. We call a (multi)subset $F$ of $\mathcal{F}$ perfect if for every subset
$B$ of $V$ of size $m$, $F$ contains a (minimal) perfect hash function for $B$. For
$t \in \mathbb{N}$, denote by $q(t)$ the probability that a multiset of $t$ random functions
drawn independently from the uniform distribution over $\mathcal{F}$ is not perfect. In the
proof of [8, Sect. III.2.3, Thm. 7], Mehlhorn argues that




for all $t \in \mathbb{N}$ and proves that the right-hand side is smaller than 1 for
$t = t^* = \lceil m e^m \ln v \rceil$.
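
A bound of the required shape follows from a plain union bound. The derivation below is a sketch under that assumption and is not quoted from [8]: a fixed size-$m$ subset $B$ of $V$ is mapped bijectively onto $\{0, \ldots, m-1\}$ by a uniformly random function with probability $m!/m^m$, and there are $\binom{v}{m} < v^m$ such subsets.

```latex
\[
q(t) \;\le\; \binom{v}{m}\Bigl(1-\frac{m!}{m^m}\Bigr)^{t}
     \;<\; v^{m}\exp\!\Bigl(-t\,\frac{m!}{m^m}\Bigr)
     \;\le\; v^{m}\exp\!\bigl(-t\,e^{-m}\bigr)
\]
```

The last step uses $m! \ge (m/e)^m$. For $t \ge m e^m \ln v$ the final expression is at most $v^m \cdot v^{-m} = 1$, so $q(t) < 1$.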

It follows that there exists a perfect set of size $t^*$, and Mehlhorn proceeds to
define a canonical such set $F^*$ as the first perfect set encountered in some fixed
enumeration of the subsets of $\mathcal{F}$ of size $t^*$. Because $F^*$ can be recalculated for
every query, it need not be stored, which is crucial in the original setting. In our
application, however, $m$ and $v$ are tiny, and storing a perfect set is feasible. This
allows us to replace the deterministic procedure of [8], which runs in doubly
exponential time, by a randomized procedure whose running time is merely
singly exponential.

We first observe that since $m! \ge (m/e)^m$,

,((• + 1) < ,(i") < 1 _ e- 

It follows that if we repeatedly draw a multiset of $t^* + 1$ random functions
from $\mathcal{F}$ until a perfect multiset is encountered, then the expected number of
trials is $O(e^m)$. Moreover, each multiset can be tested for perfectness in time
O + 1)) = 0(w’”e’”lnw), so a perfect multiset F* of t* -I- 1 functions 

from $\mathcal{F}$ can be found in expected time. As a by-product of the computation,
we discover for each subset $B$ of $V$ of size $m$ a function in $F^*$ that is
perfect for $B$.
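
The randomized search can be written out for tiny parameters. The Python sketch below is illustrative only: the function names are hypothetical, functions are represented as tuples of length v, and the multiset size is taken to be ceil(m e^m ln v) + 1 in line with the text. It also records, for each subset, the index of a function that is perfect for it, which plays the role of the by-product mentioned above.

```python
import itertools
import math
import random

def is_perfect_for(func, subset, m):
    """func is a tuple of length v; it is a minimal perfect hash function
    for subset if it maps the m keys onto {0, ..., m-1} bijectively."""
    return len({func[x] for x in subset}) == m

def find_perfect_multiset(m, v, seed=0):
    """Draw t*+1 uniformly random functions V -> {0..m-1} until every
    m-subset of V has a perfect function among them. Returns the
    multiset and, for each subset, the index of a perfect function."""
    rng = random.Random(seed)
    t_star = math.ceil(m * math.exp(m) * math.log(v))
    subsets = list(itertools.combinations(range(v), m))
    while True:
        funcs = [tuple(rng.randrange(m) for _ in range(v))
                 for _ in range(t_star + 1)]
        pointer = {}
        for B in subsets:
            idx = next((i for i, f in enumerate(funcs)
                        if is_perfect_for(f, B, m)), None)
            if idx is None:
                break               # not perfect, draw a new multiset
            pointer[B] = idx
        else:
            return funcs, pointer
```

The running time is exponential in m, so the sketch is only usable for very small m and v, which is exactly the regime in which the construction is applied.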

Lemma 7. Given integers $m, v > 2$ and a subset $B$ of size $m$ of $\{0, \ldots, v-1\}$,
$v^{O(m)}$ expected time and words of working space suffice to compute a
minimal perfect hash function for $B$ that can be evaluated in constant time and
whose representation consists of $v^{O(m)}$ bits that depend only on $m$ and $v$ and
$m \log e + \log\log v + O(\log m)$ bits that depend also on $B$.

Proof. Carry out the construction described above and store the set $F^*$, which
depends only on $m$ and $v$, as well as a pointer of
$\log |F^*| = m \log e + \log\log v + O(\log m)$ bits to a function in $F^*$ that is perfect for $B$.

5 The Complete Construction 

In Section 3 the problem of hashing the original set $S$ was reduced to that of
hashing groups $G_1, \ldots, G_k$, where $|G_i| \le h$ and $\max G_i < v$ for $i = 1, \ldots, k$
and $k = O(n/h)$. More precisely, we showed that, within the resource bounds
of Theorem 1, we can map $S$ injectively to a set $\tilde{S}$ of pairs
$(i, j) \in \{1, \ldots, k\} \times \{0, \ldots, v-1\}$ such that for $i = 1, \ldots, k$, $|G_i| \le h$, where
$G_i = \tilde{S} \cap (\{i\} \times \{0, \ldots, v-1\})$. Moreover, for a certain constant $\rho > 1$, we showed in Section 4
how to compute a minimal perfect hash function $h_i$ for $G_i$ in at most $v^{\rho h}$ steps,
for $i = 1, \ldots, k$, such that the representation space of $h_i$ consists of at most $v^{\rho h}$
bits that depend only on $|G_i|$ (the shared part) and $|G_i| \log e + O(\log\log n)$ bits
that depend also on $G_i$ (the individual part). We still need to describe how to
combine the solutions for single groups into a solution for the full set $S$.

Fix a constant $\kappa$ so that $v < (\log n)^{\kappa}$ and recall that $h = \gamma \log n / \log\log n$ for
a constant $\gamma > 0$ that can still be chosen freely. Now

\[
v^{\rho h} < 2^{\kappa \log\log n \cdot \rho \cdot \gamma \log n / \log\log n} = n^{\kappa \rho \gamma}.
\]

We choose $\gamma = 1/(3\kappa\rho)$, which makes $v^{\rho h} < n^{1/3}$. Then, since the number of
possible distinct groups is bounded by $1 + v^h = O(n^{1/3})$, we can compute minimal
perfect hash functions for all possible groups in $O(n^{2/3})$ time. We compute a
table mapping each possible group size to the corresponding shared part and
another table mapping each group, represented as an integer of size $O(n^{1/3})$, to
the corresponding individual part. The space needed for these tables is negligible.
We create another table $L$ mapping each $i \in \{1, \ldots, k\}$ to $|G_i|$, which allows the
shared part of $h_i$ to be accessed, and to the individual part of $h_i$. The entries in
$L$ can be computed in $O(n)$ time, and their total size is

\[
\sum_{i=1}^{k} \bigl( |G_i| \log e + O(\log\log n) \bigr) = n \log e + O\bigl( n (\log\log n)^2 / \log n \bigr)
\]

bits, in accordance with Theorem 1. We still have to solve two problems, however:

(1) We cannot use the functions $h_1, \ldots, h_k$ directly, because their ranges overlap.
Rather, we would like to replace $h_i$ by $h_i + \sum_{j=1}^{i-1} |G_j|$, for $i = 1, \ldots, k$.

(2) Because the entries in $L$ are not all of the same length, it is not clear how
to access the $i$th entry in constant time.

The following lemma provides a solution to these problems.

Lemma 8. For all integers $m, N > 4$, given $m$ integers $a_1, \ldots, a_m$ with
$\sum_{i=1}^{m} a_i \le N$, a data structure that occupies $O(m(\log\log m + \log(1 + N/m)))$ bits and
allows the computation of $b_i = \sum_{j=1}^{i} a_j$ from $i$ in constant time, for $i = 1, \ldots, m$,
can be constructed in $O(m)$ time and space.

Proof. If N > w? , we can simply store bi, ... ,bm in a table with m entries of 
[log(fV + 1)] bits each. Assume therefore that N < m? . Our data structure is 
a tree $T$ of depth 2 with at least $m$ leaves in which every node of depth 1 has
$d = O(\log m)$ children and the root has $O(m/\log m)$ children. Conceptually, we
label the $i$th leaf of $T$, counted from the left, with $a_i$, for $i = 1, \ldots, m$, and the
remaining leaves with zero. For every node $v$ of $T$, denote by $s(v)$ the sum of
the labels at leaves that are descendants of left siblings of $v$ or equal to $v$. For
$i = 1, \ldots, m$, the prefix sum $b_i$ is the sum of $s(v)$ over all ancestors $v$ of the $i$th
leaf of $T$.

We call a leaf v good if s{v) < {N/m){logmfr , and bad otherwise. If a leaf v 
is good, we store $s(v)$ in a field of $O(\log\log m + \log(1 + N/m))$ bits associated
with $v$. Similarly, for each internal node $v$, we store $s(v)$ in a field of $O(\log m)$ bits
associated with $v$. Together, these fields occupy $O(m(\log\log m + \log(1 + N/m)))$
bits.

Call a node $v$ of depth 1 good if all of its children are good, and bad otherwise.
For each bad node $v$ of depth 1, we store all the values $s(y)$, where $y$ is a child
of $v$, in a table with $d$ fields of $O(\log m)$ bits each in an overflow area and store
a pointer to this table with $v$. Since the number of bad nodes of depth 1 is
bounded by m/(logm)^, an overflow area of size 0{m) suffices. Altogether, the 
space needed is 0(m(loglogm + log(l + N/m))) bits, and it is easy to see that 
bi can be computed in constant time from i for i = 1, . . . , m. 
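
The way a prefix sum decomposes into a coarse and a fine part can be illustrated by a two-level structure in Python. This is a simplified sketch: it keeps block totals and within-block partial sums in ordinary lists and does not implement the bit-packed fields or the overflow area for bad nodes, so it does not achieve the space bound of the lemma. The class and method names are hypothetical.

```python
import math

class TwoLevelPrefixSums:
    """Answer b_i = a_1 + ... + a_i from a coarse and a fine table."""

    def __init__(self, values):
        m = len(values)
        # Block width of order log m, like the fan-out of the depth-1 nodes.
        self.d = max(1, math.ceil(math.log2(m + 1)))
        self.block_start = []    # sum of all values in earlier blocks
        self.in_block = []       # inclusive partial sums inside each block
        total = 0
        for start in range(0, m, self.d):
            self.block_start.append(total)
            partial = 0
            for a in values[start:start + self.d]:
                partial += a
                self.in_block.append(partial)
            total += partial

    def prefix_sum(self, i):
        """Return b_i for 1 <= i <= m using two table lookups."""
        j = i - 1
        return self.block_start[j // self.d] + self.in_block[j]
```

For the values 3, 1, 4, 1, 5 the block width is 3, and `prefix_sum(4)` adds the total 8 of the first block to the partial sum 1 inside the second block.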

In order to solve Problem (1), we store the group sizes $|G_1|, \ldots, |G_k|$ using
the method of Lemma 8. Since $\sum_{i=1}^{k} |G_i| = n$, the space needed comes to
$O(k(\log\log n + \log(1 + n/k))) = O(n (\log\log n)^2 / \log n)$. In order to solve Problem (2),
we store the individual parts of $h_1, \ldots, h_k$ as one contiguous bit string
$W$ and store the sizes of these individual parts using the method of Lemma 8,
which allows us to pick out any individual part from $W$ in constant time.
Since the total size of all individual parts is $O(n)$, the space needed again is
$O(n (\log\log n)^2 / \log n)$. This ends the proof of Theorem 1.
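
The offset idea behind Problem (1) can be sketched in a few lines of Python. The sketch makes two simplifying assumptions that are not part of the construction: the per-group functions are replaced by a stand-in (the rank of a key within its sorted group), and the offsets are kept in a plain list instead of the compact structure of Lemma 8. The function name is hypothetical.

```python
from itertools import accumulate

def build_global_mphf(groups):
    """groups: list of lists of distinct keys, one list per group.
    Returns a function mapping (group index, key) to {0, ..., n-1}."""
    # Stand-in for the per-group minimal perfect hash functions:
    # the rank of the key within its sorted group.
    local = [{x: rank for rank, x in enumerate(sorted(g))} for g in groups]
    sizes = [len(g) for g in groups]
    # offsets[i] = total size of all groups that precede group i.
    offsets = [0] + list(accumulate(sizes))[:-1]
    def h(i, x):
        return offsets[i] + local[i][x]
    return h
```

With groups `[[5, 2], [9], [7, 1, 3]]` the three local ranges {0, 1}, {0} and {0, 1, 2} are shifted by 0, 2 and 3, so the combined function is a bijection onto {0, ..., 5}.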

Abstract. Most known constructions of probabilistically checkable proofs
(PCPs) either blow up the proof size by a large polynomial, or have a high (though
constant) query complexity. In this paper we give a transformation with slightly
super-cubic blowup in proof size, with a low query complexity. Specifically, the
verifier probes the proof in 16 bits and rejects every proof of a false assertion
with probability arbitrarily close to 1/2, while accepting correct proofs of theorems
with probability one. The proof is obtained by revisiting known constructions
and improving numerous components therein. In the process we abstract a
number of new modules that may be of use in other PCP constructions.



1 Introduction 

Probabilistically checkable proofs (PCP) have played a major role in proving the hard- 
ness of approximation of various combinatorial optimization problems. Constructions 
of PCPs have been the subject of active research in the last ten years. In the last decade, 
there have been several “efficient” constructions of PCPs which in turn have resulted in
tighter inapproximability results. Arora et al. [1] showed that it is possible to transform 
any proof into a probabilistically checkable one of polynomial size, such that it is ver- 
ifiable with a constant number of queries. Valid proofs are accepted with probability 
one (this parameter is termed the completeness of the proof), while any purported proof 
of an invalid assertion is rejected with probability 1/2 (this parameter is the soundness 
of the proof). Neither the proof size, nor the query complexity is explicitly described 
there; however the latter is estimated to be around 10®. 

Subsequently much success has been achieved in improving the parameters of PCPs, 
constructing highly efficient proof systems either in terms of their size or their query 
complexity. The best result in terms of the former is a result of Polishchuk and Spiel- 
man [12]. They show how any proof can be transformed into a probabilistically check- 
able proof with only a mild blowup in the proof size, of for arbitrarily small e > 0 

and that is checkable with only a constant number of queries. This number of queries 
however is of the order of 0(l/e^), with the constant hidden by the big-Oh being some 
multiple of the query complexity of [1]. On the other hand, Håstad [10] has constructed
PCPs for arbitrary NP statements where the query complexity is a mere three bits (for
completeness almost 1 and soundness 1/2). However the blowup in the proof size of
Håstad’s PCPs has an exponent proportional to the query complexity of the PCP of [1].
Thus neither of these “nearly-optimal” results provides simultaneous optimality of the
two parameters. It is reasonable to wonder if this inefficiency in the combination of the
two parameters is inherent; and our paper is motivated by this question.

We examine the size and query complexity of PCPs jointly and obtain a construction 
with reasonable performance in both parameters. The only previous work that mentions 
the joint size vs. query complexity of PCPs is a work of Friedl and Sudan [8], who 
indicate that NP has PCPs with nearly quadratic size complexity and in which the ver- 
ifier queries the proof for 165 bits. The main technical ingredient in their proof was 
an improved analysis of the “low-degree test”. Subsequent to this work, the analysis 
of low-degree tests has been substantially improved. Raz and Safra [13] and Arora and
Sudan [3] have given highly efficient analyses of different low-degree tests. Furthermore,
techniques available for “proof composition” have improved, as have the
constructions for terminal “inner verifiers”. In particular, the work of Håstad [9,10] has
significantly strengthened the ability to analyze inner verifiers used at the final composition
step of PCP constructions.

In view of these improvements, it is natural to expect the performance of PCP con- 
structions to improve. Our work confirms this expectation. However, our work exposes 
an enormous number of complications in the natural path of improvement. We resolve 
most of these, with little loss in performance and thereby obtain the following result: 
Satisfiability has a PCP verifier that makes at most 16 oracle queries to a proof of size 
at most where n is the size of the instance of satisfiability. Satisfiable instances 

have proofs that are accepted with probability one, while unsatisfiable instances are 
accepted with probability arbitrarily close to 1/2. (See Main Theorem 1.) 

We also raise several technical questions whose positive resolution may lead to a 
PCP of nearly quadratic size and query complexity of 6. Surprisingly, no non-trivial 
limitations are known on the joint size + query complexity of PCPs. In particular, it is 
open as to whether nearly linear sized PCPs with query complexity of 3 exist for NP 
statements. 

2 Overview 

We first recall the standard definition of the class $\mathrm{PCP}_{c,s}[r, q]$.

Definition 1. For functions $r, q : \mathbb{Z}^+ \to \mathbb{Z}^+$, a probabilistic oracle machine (or verifier)
$V$ is $(r, q)$-restricted if on input $x$ of length $n$, the verifier tosses at most $r(n)$ random
coins and queries an oracle $\pi$ for at most $q(n)$ bits. A language $L \in \mathrm{PCP}_{c,s}[r, q]$
if there exists an $(r, q)$-restricted verifier $V$ that satisfies the following properties on
input $x$.

Completeness. If $x \in L$ then there exists $\pi$ such that $V$ on oracle access to $\pi$ accepts
with probability at least $c$.

Soundness. If $x \notin L$ then for every oracle $\pi$, the verifier $V$ accepts with probability
strictly less than $s$.
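
To make the roles of r, q, c and s concrete, here is a toy verifier in Python for graph 2-colorability. It is chosen only because it is easy to analyze and is an illustration of the definition, not a construction from this paper. The proof is a list of colors, the random string selects an edge, and the verifier reads q = 2 proof positions. All names are hypothetical, and the sketch assumes that the number of edges divides 2^r so that edges are selected uniformly.

```python
from fractions import Fraction

def verifier(edges, proof, coin):
    """Use the random string `coin` to select one edge and query the
    proof at its two endpoints (q = 2 queries)."""
    u, v = edges[coin % len(edges)]
    return proof[u] != proof[v]

def acceptance_probability(edges, proof, r):
    """Exact acceptance probability over all 2**r random strings."""
    accepted = sum(verifier(edges, proof, coin) for coin in range(2 ** r))
    return Fraction(accepted, 2 ** r)
```

On the 4-cycle with r = 2 the proof `[0, 1, 0, 1]` is accepted with probability 1. On a graph containing a triangle, every proof violates at least one edge, so the acceptance probability stays below 1; the gap is small, which is why real PCP constructions need far more machinery.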

While our principal interest is in the size of a PCP and not in the randomness, it is well- 
known that the size of a probabilistically checkable proof (or more precisely, the number 
of distinct queries to the oracle tt) is at most Thus the size is implicitly 

governed by the randomness and query complexity of a PCP. The main result of this 
paper is the following. 

Main Theorem 1. For every $\varepsilon, \mu > 0$, $\mathrm{SAT} \in \mathrm{PCP}_{1, \frac{1}{2}+\mu}[(3 + \varepsilon) \log n, 16]$.

Remark: Actually the constants $\varepsilon$ and $\mu$ above can be replaced by some $o(1)$ functions;
but we don’t derive them explicitly.

It follows from the parameters that the associated proof is of size at most $O(n^{3+\varepsilon})$.
Cook [6] showed that any language in $\mathrm{NTIME}(t(n))$ could be reduced to SAT in
$O(t(n) \log t(n))$ time such that instances of size $n$ are mapped to Boolean formulae
of size at most $O(t(n) \log t(n))$. Combining this with the Main Theorem 1, we have
that every language in NP has a PCP with at most a slightly super-cubic blowup in
proof size and a query complexity as low as 16 bits.
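
The size claim can be checked by a one-line calculation. The sketch below uses the crude count $2^{r(n)+q(n)}$ of proof positions that a verifier could ever address, which is an assumption about the counting argument rather than a quotation from the paper; with a constant number of queries any such count gives the same asymptotics. Logarithms are taken to base 2.

```latex
\[
2^{r(n)+q(n)} \;=\; 2^{(3+\varepsilon)\log n}\cdot 2^{16}
\;=\; 2^{16}\, n^{3+\varepsilon} \;=\; O\!\left(n^{3+\varepsilon}\right)
\]
```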

2.1 MIP and Recursive Proof Composition 

As pointed out earlier, the parameters we seek are such that no existing proof system 
achieves them. Hence we work our way through the PCP construction of Arora et al. [1]
and make every step as efficient as possible. The key ingredient in their construction (as 
well as most subsequent constructions) is the notion of recursive composition of proofs, 
a paradigm introduced by Arora and Safra [2]. The paradigm of recursive composition 
is best described in terms of multi-prover interactive proof systems (MIPs). 

Definition 2. For integer $p$, and functions $r, a : \mathbb{Z}^+ \to \mathbb{Z}^+$, an MIP verifier $V$ is
$(p, r, a)$-restricted if it interacts with $p$ mutually-non-interacting provers $\pi_1, \ldots, \pi_p$ in
the following restricted manner. On input $x$ of length $n$, $V$ picks a random $r(n)$-bit
string $R$ and generates $p$ queries $q_1, \ldots, q_p$ and a circuit $C$ of size at most $a(n)$. The
verifier then issues query $q_i$ to prover $\pi_i$. The provers respond with answers $a_1, \ldots, a_p$
each of length at most $a(n)$ and the verifier accepts $x$ iff $C(a_1, \ldots, a_p) = \mathrm{true}$. A
language $L$ belongs to $\mathrm{MIP}_{c,s}[p, r, a]$ if there exists a $(p, r, a)$-restricted MIP verifier
$V$ such that on input $x$:

Completeness. If $x \in L$ then there exist $\pi_1, \ldots, \pi_p$ such that $V$ accepts with probability
at least $c$.

Soundness. If $x \notin L$ then for every $\pi_1, \ldots, \pi_p$, $V$ accepts with probability less than $s$.

It is easy to see that $\mathrm{MIP}_{c,s}[p, r, a]$ is a subclass of $\mathrm{PCP}_{c,s}[r, pa]$ and thus it is beneficial
to show that SAT is contained in MIP with nice parameters. However, much
stronger benefits are obtained if the containment has a small number of provers, even if 
the answer size complexity (a) is not very small. This is because the verifier’s actions 
can usually be simulated by a much more efficient verification procedure, one with 
much smaller answer size complexity, at the cost of a few more provers. Results of this 
nature are termed proof composition lemmas; and the efficient simulators of the MIP 
verification procedure are usually called “inner verification procedures”. 

The next three lemmas divide the task of proving Main Theorem 1 into smaller
subtasks. The first gives a starting MIP for satisfiability, with 3 provers, but polylogarithmic
answer size. We next give the composition lemma that is used in the intermediate
stages. The final lemma gives our terminal composition lemma, the one
that reduces answer sizes from some slowly growing function to a constant.

Lemma 2. For every $\varepsilon, \mu > 0$, $\mathrm{SAT} \in \mathrm{MIP}_{1,\mu}[3, (3 + \varepsilon) \log n, \mathrm{poly} \log n]$.

Lemma 2 is proven in Sect. 3. This lemma is critical to bounding the proof size. This
lemma follows the proof of a similar one (the “parallelization” step) in [1]; however
various aspects are improved. We show how to incorporate advances made by Polishchuk
and Spielman [12], and how to take advantage of the low-degree test of Raz
and Safra [13]. Most importantly, we show how to save a quadratic blowup in this
phase that would be incurred by a direct use of the parallelization step in [1].

The first composition lemma we use is an off-the-shelf product due to [3]. Similar
lemmas are implicit in the works of Bellare et al. [5] and Raz and Safra [13].

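Lemma 3 ([3]). For every $\varepsilon > 0$ and $p < \infty$, there exist constants $c_1, c_2, c_3$ such that
for every $r, a : \mathbb{Z}^+ \to \mathbb{Z}^+$,

\[
\mathrm{MIP}_{1,\varepsilon}[p, r, a] \subseteq \mathrm{MIP}_{1,\varepsilon^{1/(2p+2)}}[p + 3,\; r + c_1 \log a,\; c_2 (\log a)^{c_3}].
\]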

The next lemma shows how to truncate the recursion. This lemma is proved in Sect. 4 
using a “Fourier-analysis” based proof, as in [9,10]. This is the first time that this style 
of analysis has been applied to MIPs with more than 2 provers. All previous analyses 
seem to have focused on composition with canonical 2-prover proof systems at the outer 
level. Our analysis reveals surprising complications and forces us to use a large number 
(seven) of extra bits to effect the truncation. 

Lemma 4. For every $\varepsilon > 0$ and $p < \infty$, there exists a $\gamma > 0$ such that for every
$r, a : \mathbb{Z}^+ \to \mathbb{Z}^+$,



MIPi,^[p,r,a] CPCPi,i+Jr + 0(2n,P+7] . 

Proof (of Main Theorem 1). The proof is straightforward given the above lemmas. We
first apply Lemma 2 to get a 3-prover MIP for SAT, then apply Lemma 3 twice to get a 6-prover
and then a 9-prover MIP for SAT. The answer size in the final stage is poly log log log n.
Applying Lemma 4 at this stage we obtain a 16-query PCP for SAT; and the total randomness
in all stages remains $(3 + \varepsilon) \log n$. □
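
The bookkeeping in this proof can be traced mechanically. The short Python sketch below tracks only the number of provers and the final query count, assuming (as the statements above indicate) that each application of Lemma 3 adds three provers and that Lemma 4 costs seven extra queries; the function name and its parameters are hypothetical.

```python
def compose_parameters(initial_provers=3, composition_steps=2,
                       provers_added_per_step=3, extra_queries=7):
    """Track the prover count through Lemma 2, repeated Lemma 3, and Lemma 4."""
    provers = initial_provers                      # Lemma 2: 3-prover MIP
    history = [provers]
    for _ in range(composition_steps):             # Lemma 3: p -> p + 3
        provers += provers_added_per_step
        history.append(provers)
    queries = provers + extra_queries              # Lemma 4: p provers -> p + 7 queries
    return history, queries
```

Calling `compose_parameters()` with the defaults reproduces the sequence 3, 6, 9 of prover counts and the final count of 16 queries.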

Organization of the Paper: In Section 3, we prove Lemma 2. For this purpose, we
present the Polynomial Constraint Satisfaction problem in Section 3.2 and discuss its
hardness. We then discuss the Low-Degree Test in Section 3.3. Most aspects of the
proofs in Section 3 are drawn from previous works of [1,3,12,13]. Hence, we abstract
the main results in this section and leave the detailed proofs to the full version of the
paper.* In Section 4, we present the proof of Lemma 4. In Section 5 we suggest possible
approaches for improvements in the joint size-query complexity of PCPs.



3 A Randomness Efficient MIP for SAT 

In this section, we use the term “length-preserving reductions” to refer to reductions
in which the length of the target instance of the reduction is nearly-linear ($O(n^{1+\varepsilon})$ for
arbitrarily small $\varepsilon$) in the length of the source instance.

* A full version of this paper can be found at ftp://ftp.eccc.uni-trier.de/pub/eccc/reports/2000/TR00-061/index.html.

To prove membership in SAT, we first transform SAT into an algebraic problem.
This transformation comes in two phases. First we transform it to an algebraic problem 
(that we call AP for lack of a better name) in which the constraints can be enumerated 
compactly. Then we transform it to a promise problem on polynomials, called Polyno- 
mial Constraint Satisfaction (PCS), with a large associated gap. We then show how to 
provide an MIP verifier for the PCS problem. 

Though most of these results are implicit in the literature, we find that abstract- 
ing them cleanly significantly improves the exposition of PCPs. The first problem, 
AP, could be proved to be NP-hard almost immediately, if one did not require length- 
preserving reductions. We show how the results of Polishchuk and Spielman [12] imply 
a length preserving reduction from SAT to this problem. We then reduce this problem 
to PCS. This step mimics the sum-check protocol of Lund et al. [11]. The technical im- 
portance of this intermediate step is the fact that it does not refer to “low-degree” tests 
in its analysis. Low-degree tests are primitives used to test if the function described 
by a given oracle is close to some (unknown) multivariate polynomial of low-degree. 
Low-degree tests have played a central role in the constructions of PCPs. Here we sep- 
arate (to a large extent) their role from other algebraic manipulations used to obtain 
PCPs/MIPs for SAT . 

In the final step, we show how to translate the use of state-of-the-art low-degree
tests, in particular the test of Raz and Safra [13], in conjunction with the hardness of
PCS to obtain a 3-prover MIP for SAT. This part follows a proof of Arora et al. [1]
(their parallelization step); however a direct implementation would involve $6 \log n$ randomness,
or an $n^6$ blowup in the size of the proof. Part of this is a cubic blowup due
to the use of the low-degree test and we are unable to get around this part. Direct use of
the parallelization also results in a quadratic blowup of the resulting proof. We save on
this by creating a variant of the parallelization step of [1] that uses higher dimensional
varieties instead of 1-dimensional ones.



3.1 A Compactly Described Algebraic NP-Hard Problem 

Definitions. For functions m,h : TZi^ — > 2Z^ , the problem AP^./i has as its in- 
stances (1", H, T, ip, pi, . . . , pe) where: H is afield of size h{n), ip : H is a 

constant degree polynomial, T is an arbitrary function from to H and the pi ’s are 
linear maps from iT’” to H'^, for m = m{n). (T is specified by a table of values, and 
Pi’s by m X m matrices.) (1", H, T, ip,pi, . . . , pf) S AP^./i if there exists an assign- 
ment A : iJ™ — > H such that for every x G iT™, ip(T(x), A{pi{x )), . . . , A{pq{x))) = 
0 . 

The above problem is just a simple variant of standard constraint satisfaction problems,
the only difference being that its variables and constraints are now indexed by elements
of $H^m$. The only algebra in the above problem is in the fact that the functions $p_i$,
which dictate which variables participate in which constraint, are linear functions. The
following statement, abstracted from [12], gives the desired hardness of AP.

Lemma 5. There exists a constant c such that for any pair of functions m, h : — > 

7Z^ satisfying > n and SAT reduces to AP^./t 

under length preserving reductions. 







Prahladh Harsha and Madhu Sudan 



We note that Szegedy [16] has given an alternate abstraction of the result of [12] which 
focuses on some different aspects and does not suffice for our purposes. 

3.2 Polynomial Constraint Satisfaction 

We next present an instance of an algebraic constraint satisfaction problem. This dif- 
fers from the previous one in that its constraints are “wider”, the relationship between 
constraints and variables that appear in it is arbitrary (and not linear), and the hardness 
is not established for arbitrary assignment functions, but only for low-degree functions. 
All the above changes only make the problem harder, so we ought to gain something 
- and we gain in the gap of the hardness. The problem is shown to be hard even if the
goal is only to separate satisfiable instances from instances in which only an $\varepsilon$ fraction of
the constraints are satisfiable. We define this gap version of the problem first.

Definition 4. For $\varepsilon : \mathbb{Z}^+ \to \mathbb{R}^+$, and $m, b, q : \mathbb{Z}^+ \to \mathbb{Z}^+$, the promise problem
$\mathrm{GapPCS}_{\varepsilon,m,b,q}$ has as instances $(1^n, d, k, s, \mathbb{F}; C_1, \ldots, C_t)$, where $d, k, s < b(n)$ are
integers and $\mathbb{F}$ is a field of size $q(n)$ and $C_j = (A_j; x_1^{(j)}, \ldots, x_k^{(j)})$ is an algebraic constraint,
given by an algebraic circuit $A_j$ of size $s$ on $k$ inputs and $x_1^{(j)}, \ldots, x_k^{(j)} \in \mathbb{F}^m$,
for $m = m(n)$. $(1^n, d, k, s, \mathbb{F}; C_1, \ldots, C_t)$ is a YES instance if there exists a polynomial
$p : \mathbb{F}^m \to \mathbb{F}$ of degree at most $d$ such that for every $j \in \{1, \ldots, t\}$, the constraint
$C_j$ is satisfied by $p$, i.e., $A_j(p(x_1^{(j)}), \ldots, p(x_k^{(j)})) = 0$. $(1^n, d, k, s, \mathbb{F}; C_1, \ldots, C_t)$ is a
NO instance if for every polynomial $p : \mathbb{F}^m \to \mathbb{F}$ of degree at most $d$ it is the case that
at most $\varepsilon(n) \cdot t$ of the constraints $C_j$ are satisfied.



Lemma 6. There exist constants $c_1, c_2$ such that for every choice of functions $\epsilon, m, b, q$
satisfying $(b(n)/m(n))^{m(n)-c_1} > n$, $q(n)^{m(n)} = \ldots$, $q(n) > c_2\, b(n)/\epsilon(n)$,
SAT reduces to $\mathrm{GapPCS}_{\epsilon,m,b,q}$ under length preserving reductions.

(The problem $\mathrm{AP}_{m,h}$ is used as an intermediate problem in the reduction. However we
don’t mention this in the lemma, since the choice of parameters m, h may confuse the 
statement further.) The proof of this lemma is inspired by the sum-check protocol of 
Lund et al. [11] while the specific steps in our proof follow the proof in Sudan [15]. 

3.3 Low-Degree Tests 

Using GapPCS it is easy to produce a simple probabilistically checkable proof for SAT.
Given an instance of SAT, reduce it to an instance $\mathcal{I}$ of GapPCS, and provide as proof
the polynomial $p : \mathbb{F}^m \to \mathbb{F}$ as a table of values. To verify correctness a verifier first
“checks” that $p$ is close to some polynomial and then verifies that a random constraint
$C_j$ is satisfied by $p$. Low-degree tests are procedures designed to address the first part
of this verification step - i.e., to verify that an arbitrary function $f : \mathbb{F}^m \to \mathbb{F}$ is close
to some (unknown) polynomial $p$ of degree $d$.

Low-degree tests have been a subject of much research in the context of program 
checking and PCPs. For our purposes, we need tests that have very low probability of 
error. Two such tests with analyses are known, one due to Raz and Safra [13] and another
due to Rubinfeld and Sudan [14] (with low-error analysis by Arora and Sudan [3]).
For our purposes the test of Raz and Safra is more efficient. We describe their results
first and then compare its utility with the result in [3].

A plane in $\mathbb{F}^m$ is a collection of points parametrized by two variables. Specifically,
given $a, b, c \in \mathbb{F}^m$, the plane $\wp_{a,b,c} = \{\wp_{a,b,c}(t_1, t_2) = a + t_1 b + t_2 c \mid t_1, t_2 \in \mathbb{F}\}$.

Several parameterizations are possible for a given plane. We assume some canonical
one is fixed for every plane, and thus the plane is equivalent to the set of points it
contains. The low-degree test uses the fact that for any polynomial $p : \mathbb{F}^m \to \mathbb{F}$ of
degree $d$, the function $p_\wp : \mathbb{F}^2 \to \mathbb{F}$ given by $p_\wp(t_1, t_2) = p(\wp(t_1, t_2))$ is a bivariate
polynomial of degree $d$. The verifier tests this property for a function $f$ by picking a
random plane through $\mathbb{F}^m$ and verifying that there exists a bivariate polynomial that
has good agreement with $f$ restricted to this plane. The verifier expects an auxiliary
oracle $f_{\text{planes}}$ that gives such a bivariate polynomial for every plane. This motivates the
test below.

Low-Degree Test (Plane-Point Test)

Input: A function $f : \mathbb{F}^m \to \mathbb{F}$ and an oracle $f_{\text{planes}}$, which for each plane in $\mathbb{F}^m$
gives a bivariate degree $d$ polynomial.

1. Choose a random point in the space $x \in_U \mathbb{F}^m$.

2. Choose a random plane $\wp$ passing through $x$ in $\mathbb{F}^m$.

3. Query $f_{\text{planes}}$ on $\wp$ to obtain the polynomial $h_\wp$. Query $f$ on $x$.

4. Accept iff the value of the polynomial $h_\wp$ at $x$ agrees with $f(x)$.

It is clear that if $f$ is a degree $d$ polynomial, then there exists an oracle $f_{\text{planes}}$ such
that the above test accepts with probability 1. It is non-trivial to prove any converse and
Raz and Safra give a strikingly strong converse (see Theorem 7).
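
The four steps of the plane-point test can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not in the text: the field is a small prime field, the oracle for planes returns the restriction as a callable instead of a table of coefficients, no canonical parametrization is fixed, and degenerate planes are not excluded. The names `poly`, `honest_planes_oracle` and `plane_point_test` are hypothetical.

```python
import random

P = 101  # small prime; the field is Z/PZ (illustrative choice)
M = 3    # number of variables m


def poly(x):
    """A degree-2 polynomial on F^3, standing in for the proof polynomial p."""
    x1, x2, x3 = x
    return (x1 * x2 + 5 * x3 * x3 + 7) % P


def plane_point(a, b, c, t1, t2):
    """The point a + t1*b + t2*c of the plane with parameters (a, b, c)."""
    return tuple((ai + t1 * bi + t2 * ci) % P for ai, bi, ci in zip(a, b, c))


def honest_planes_oracle(f):
    """Oracle answering each plane with the true restriction of f to it."""
    def oracle(a, b, c):
        return lambda t1, t2: f(plane_point(a, b, c, t1, t2))
    return oracle


def plane_point_test(f, planes, rng):
    # Step 1: random point x.
    x = tuple(rng.randrange(P) for _ in range(M))
    # Step 2: random plane through x, chosen so that plane(t1, t2) = x.
    b = tuple(rng.randrange(P) for _ in range(M))
    c = tuple(rng.randrange(P) for _ in range(M))
    t1, t2 = rng.randrange(P), rng.randrange(P)
    a = tuple((xi - t1 * bi - t2 * ci) % P for xi, bi, ci in zip(x, b, c))
    # Step 3: query the planes oracle and f.
    h = planes(a, b, c)
    # Step 4: accept iff the plane polynomial agrees with f at x.
    return h(t1, t2) == f(x)
```

With `f = poly` and the honest oracle the test always accepts, matching the completeness remark above. If `f` is corrupted on a fraction of the points while the oracle still answers with restrictions of `poly`, the test rejects exactly when the sampled point is a corrupted one.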

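First some more notation. Let $\mathrm{LDT}^{f, f_{\text{planes}}}(x, \wp)$ denote the outcome of the above
test on oracle access to $f$ and $f_{\text{planes}}$. Let $f, g : \mathbb{F}^m \to \mathbb{F}$ have agreement $\delta$ if
$\Pr_{x \in_U \mathbb{F}^m}[f(x) = g(x)] = \delta$.

Theorem 7. There exist constants $c_0, c_1$ such that for every positive real $\delta$, integers $m, d$
and field $\mathbb{F}$ satisfying $|\mathbb{F}| > c_0 d (m/\delta)^{c_1}$, the following holds: Fix $f : \mathbb{F}^m \to \mathbb{F}$ and
$f_{\text{planes}}$. Let $\{P_1, \ldots, P_l\}$ be the set of all $m$-variate polynomials of degree $d$ that have
agreement at least $\delta/2$ with the function $f : \mathbb{F}^m \to \mathbb{F}$. Then

$\Pr_{x,\wp}\left[f(x) \notin \{P_1(x), \ldots, P_l(x)\} \text{ and } \mathrm{LDT}^{f, f_{\text{planes}}}(x, \wp) = \text{accept}\right] \le \delta.$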

Remarks: 

1. The actual theorem statement of Raz and Safra differs in a few aspects. The main
differences are that the exact bound on the agreement probability described is different,
and that their claim may only say that if the low-degree test passes with probability
greater than $\delta$, then there exists some polynomial that agrees with $f$ in some fraction
of the points. The full version of this paper will include a proof of the above theorem
from the statement of Raz and Safra.

2. The cubic blowup in our proof size occurs from the oracle $f_{\text{planes}}$ which has size
cubic in the size of the oracle $f$. A possible way to make the proof shorter would be to
use an oracle for $f$ restricted only to lines (i.e., an analogous line-point test to the above
test). The analysis of [3] does apply to such a test. However they require the field size
to be (at least) a fourth power of the degree, and this results in a blowup in the proof to
(at least) an eighth power. Note that the above theorem only needs a linear relationship
between the degree and the field size.

3.4 Putting them Together

As pointed out earlier a simple PCP for GapPCS can be constructed based on the
low-degree test above. A proof would be an oracle $f$ representing the polynomial and the
auxiliary oracle $f_{\text{planes}}$. The verifier performs a low-degree test on $f$ and then picks a
random constraint $C_j$ and verifies that $C_j$ is satisfied by the assignment $f$. But the naive
implementation would make $k$ queries to the oracle $f$ and this is too many queries. The
same problem was faced by Arora et al. [1] who solved it by running a curve through the
$k$ points and then asking a new oracle $f_{\text{curves}}$ to return the value of $f$ restricted to this
curve. This solution cuts down the number of queries to 3, but the analysis of correctness
works only if $|\mathbb{F}| > kd$. In our case, this would impose an additional quadratic blowup
in the proof size and we would like to avoid this. We do so by picking $r$-dimensional
varieties (algebraic surfaces) that pass through the given $k$ points. This cuts down the
degree to $r k^{1/r}$. However some additional complications arise: The variety needs to
pass through many random points, but not at the expense of too much randomness. We
deal with these issues below.

A variety $V : \mathbb{F}^r \to \mathbb{F}^m$ is a collection of $m$ functions, $V = (V_1, \ldots, V_m)$,
$V_i : \mathbb{F}^r \to \mathbb{F}$. A variety is of degree $D$ if all the functions $V_1, \ldots, V_m$ are polynomials of
degree $D$. For a variety $V$ and function $f : \mathbb{F}^m \to \mathbb{F}$, the restriction of $f$ to $V$ is the
function $f|_V : \mathbb{F}^r \to \mathbb{F}$ given by $f|_V(a_1, \ldots, a_r) = f(V(a_1, \ldots, a_r))$. Note that the
restriction of a degree $d$ polynomial $p : \mathbb{F}^m \to \mathbb{F}$ to an $r$-dimensional variety $V$ of
degree $D$ is an $r$-variate polynomial of degree $Dd$.
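
As a small worked instance of this degree bound (the particular variety and polynomial are chosen here purely for illustration and do not come from the construction), take $m = 2$ and $r = 1$:

```latex
V(t) = (t,\; t^2) \quad \text{has degree } D = 2, \qquad
p(x_1, x_2) = x_1 x_2 \quad \text{has degree } d = 2, \\
p|_V(t) = p(t, t^2) = t^3, \qquad \deg p|_V = 3 \le Dd = 4 .
```

The bound $Dd$ is an upper bound, as the example shows: the restriction may have smaller degree.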

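Let $S \subseteq \mathbb{F}$ be of cardinality $k^{1/r}$. Let $z_1, \ldots, z_k$ be some canonical ordering of
the points in $S^r$. Let $V^{(0)}_{S,x_1,\ldots,x_k} : \mathbb{F}^r \to \mathbb{F}^m$ denote a canonical variety of degree
$r|S|$ that satisfies $V^{(0)}_{S,x_1,\ldots,x_k}(z_i) = x_i$ for every $i \in \{1, \ldots, k\}$. Let $Z_S : \mathbb{F}^r \to \mathbb{F}$
be the function given by $Z_S(y_1, \ldots, y_r) = \prod_{i=1}^{r} \prod_{a \in S}(y_i - a)$; i.e., $Z_S(z_i) = 0$ for every $z_i \in S^r$.
Let $\alpha = (\alpha_1, \ldots, \alpha_m) \in \mathbb{F}^m$. Let $V^{(1)}_{S,\alpha}$ be the variety $(\alpha_1 Z_S, \ldots, \alpha_m Z_S)$. We will
let $V_{S,\alpha,x_1,\ldots,x_k}$ be the variety $V^{(0)}_{S,x_1,\ldots,x_k} + V^{(1)}_{S,\alpha}$. Note that if $\alpha$ is chosen at random,
$V_{S,\alpha,x_1,\ldots,x_k}(z_i) = x_i$ for $z_i \in S^r$ and $V_{S,\alpha,x_1,\ldots,x_k}(z)$ is distributed uniformly over
$\mathbb{F}^m$ if $z \in (\mathbb{F} - S)^r$. These varieties will replace the role of the curves of [1]. We
note that Dinur et al. also use higher dimensional varieties in the proof of PCP-related
theorems [7]. Their use of varieties is for purposes quite different from ours.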

We are now ready to describe the MIP verifier for $\mathrm{GapPCS}_{\epsilon,m,b,q}$. (Henceforth, we
shall assume that $t$, the number of constraints in a $\mathrm{GapPCS}_{\epsilon,m,b,q}$ instance, is at most $\ldots$.
In fact, for our reduction from SAT (Lemma 6), $t$ is exactly equal to $q^m$.)

MIP Verifier$^{f, f_{\text{planes}}, f_{\text{varieties}}}(1^n, d, k, s, \mathbb{F}; C_1, \ldots, C_t)$.

Notation: $r$ is a parameter to be specified. Let $S \subseteq \mathbb{F}$ be such that $|S| = k^{1/r}$.

1. Pick $a, b, c \in \mathbb{F}^m$ and $z \in (\mathbb{F} - S)^r$ at random.

2. Let $\wp = \wp_{a,b,c}$. Use $b, c$ to compute $j \in \{1, \ldots, t\}$ at random (i.e., $j$ is fixed
given $b, c$, but is distributed uniformly when $b$ and $c$ are random). Compute $\alpha$
such that $V(z) = a$ for $V = V_{S,\alpha,x_1^{(j)},\ldots,x_k^{(j)}}$.

3. Query $f(a)$, $f_{\text{planes}}(\wp)$ and $f_{\text{varieties}}(V)$.
Let $g = f_{\text{planes}}(\wp)$ and $h = f_{\text{varieties}}(V)$.

4. Accept if all the conditions below are true:

(a) $g$ and $f$ agree at $a$.

(b) $h$ and $f$ agree at $a$.

(c) $A_j$ accepts the inputs $h(z_1), \ldots, h(z_k)$.

Complexity: Clearly the verifier makes exactly 3 queries. Also, exactly $3m \log q + r \log q$
random bits are used by the verifier. The answer sizes are at most
$O((d r k^{1/r} + r)^r \log q)$ bits.

Now to prove the correctness of the verifier. Clearly, if the input instance is a YES
instance then there exists a polynomial $P$ of degree $d$ that satisfies all the constraints
of the input instance. Choosing $f = P$ and constructing $f_{\text{planes}}$ and $f_{\text{varieties}}$ to be
restrictions of $P$ to the respective planes and varieties, we notice that the MIP verifier
accepts with probability one. We now bound the soundness of the verifier.

Claim 8. Let $\delta$ be any constant that satisfies the conditions of Theorem 7 and
$\delta \ge 4\sqrt{d/q}$, where $q = |\mathbb{F}|$. Then the soundness of the MIP Verifier is at most
$\delta + 4\epsilon/\delta + 4 r k^{1/r} d/(\delta(q - k^{1/r}))$.

Proof. Let $P_1, \ldots, P_l$ be all the polynomials of degree $d$ that have agreement at least
$\delta/2$ with $f$. (Note $l \le 4/\delta$ since $\delta/2 \ge 2\sqrt{d/q}$.) Now suppose the MIP Verifier had
accepted a NO instance. Then one of the following events must have taken place.

Event 1: $f(a) \notin \{P_1(a), \ldots, P_l(a)\}$ and $\mathrm{LDT}^{f, f_{\text{planes}}}(a, \wp) = \text{accept}$.

We have from Theorem 7 that Event 1 could have happened with probability at most $\delta$.

Event 2: There exists an $i \in \{1, \ldots, l\}$ such that constraint $C_j$ is satisfied with
respect to polynomial $P_i$ (i.e., $A_j(P_i(x_1^{(j)}), \ldots, P_i(x_k^{(j)})) = 0$).

As the input instance is a NO instance of $\mathrm{GapPCS}_{\epsilon,m,b,q}$, this event happens with
probability at most $l\epsilon \le 4\epsilon/\delta$.

Event 3: For all $i \in \{1, \ldots, l\}$, $P_i|_V \not\equiv h$, but the value of $h$ at $a$ is contained in
$\{P_1(a), \ldots, P_l(a)\}$.

To bound the probability of this event happening, we reinterpret the randomness of the
MIP verifier. First pick $b, c, \alpha \in \mathbb{F}^m$. From this we generate the constraint $C_j$ and this
defines the variety $V = V_{S,\alpha,x_1^{(j)},\ldots,x_k^{(j)}}$. Now we pick $z \in (\mathbb{F} - S)^r$ at random and
this defines $a = V(z)$. We can bound the probability of the event in consideration after
we have chosen $V$, as purely a function of the random variable $z$, as follows. Fix any
$i$ and $V$ such that $P_i|_V \not\equiv h$. Note that the value of $h$ at $a$ equals $h(z)$ (by definition
of $a$, $z$ and $V$). Further $P_i(a) = P_i|_V(z)$. But $z$ is chosen at random from $(\mathbb{F} - S)^r$.
By the Schwartz-Zippel lemma, the probability of agreement on this domain is at most
$r k^{1/r} d/(|\mathbb{F}| - |S|)$. Using the union bound over the $P_i$'s we get that this event happens
with probability at most $l\, r k^{1/r} d/(|\mathbb{F}| - |S|) \le 4 r k^{1/r} d/(\delta(q - k^{1/r}))$.

We thus have that the probability of the verifier accepting a NO instance is at most
$\delta + 4\epsilon/\delta + 4 r k^{1/r} d/(\delta(q - k^{1/r}))$. □

We can now complete the construction of a 3-prover MIP for SAT and give the proof 
of Lemma 2. 

Proof (of Lemma 2). Choose $\delta = \mu/3$, where $\mu$ is the soundness of the MIP we wish to
prove. Let $c_0, c_1$ be the constants that appear in Theorem 7. Choose $\epsilon' = \epsilon/2$.
Choose $\epsilon = \min\{\delta\mu/12,\ \epsilon'/3(9 + c_1)\}$. Let $n$ be the size of the SAT instance. Let
$m = \epsilon \log n/\log\log n$, $b = (\log n)^{\ldots}$ and $q = (\log n)^{\ldots}$. Note that this choice of
parameters satisfies the requirements of Lemma 6. Hence, SAT reduces to $\mathrm{GapPCS}_{\epsilon,m,b,q}$
under length preserving reductions. Combining this reduction with the MIP verifier
for GapPCS, we have a MIP verifier for SAT. Also $\delta$ satisfies the requirements of
Claim 8. Thus, this MIP verifier has soundness as given by Claim 8. Setting $r = \ldots$, we
have that for sufficiently large $n$, $4 r k^{1/r} d/(\delta(q - k^{1/r})) \le 8 r k^{1/r} d/(q\delta) \le \mu/3$. Hence,
the soundness of the MIP verifier is at most $\delta + 4\epsilon/\delta + \mu/3 \le \mu$. The randomness
used is exactly $3m \log q + r \log q$ which with the present choice of parameters
is $(3 + \epsilon') \log n + \mathrm{poly} \log n \le (3 + \epsilon) \log n$. The answer sizes are clearly $\mathrm{poly} \log n$.
Thus, $\mathrm{SAT} \in \mathrm{MIP}_{1,\mu}[3, (3 + \epsilon) \log n, \mathrm{poly} \log n]$. □

4 Constant Query Inner Verifier for MIPs 

In this section, we truncate the recursion by constructing a constant query “inner
verifier” for a $p$-prover interactive proof system. An inner verifier is a subroutine designed
to simplify the task of an MIP verifier. Say an MIP verifier $V_{\text{out}}$, on input $x$ and random
string $R$, generated queries $q_1, \ldots, q_p$ and a linear sized circuit $C$. In the standard
protocol the verifier would send query $q_i$ to prover $\Pi_i$ and receive some answer $a_i$.
The verifier accepts if $C(a_1, \ldots, a_p) = \text{true}$. An inner verifier reduces the answer size
complexity of this protocol by accessing oracles $A_1, \ldots, A_p$, which are supposedly
encodings of the responses $a_1, \ldots, a_p$, and an auxiliary oracle $B$, and probabilistically
verifying that the $A_i$'s really correspond to some commitment to strings $a_1, \ldots, a_p$ that
satisfy the circuit $C$. The hope is to get the inner verifier to do all this with very few
queries to the oracles $A_1, \ldots, A_p$ and $B$, and we do so with one (bit) query each to the
$A_i$'s and seven queries to $B$. For encoding the responses $a_1, \ldots, a_p$, we use the long
code of Bellare et al. [4]. We then adapt the techniques of Håstad [9,10] to develop and
analyze a protocol for the inner verifier.

Let $\mathcal{A} = \{+1, -1\}^a$ and $\mathcal{B} = \{(a_1, \ldots, a_p) \mid C(a_1, \ldots, a_p) = -1\}$. Let $\pi_i$ be the
projection function $\pi_i : \mathcal{B} \to \mathcal{A}$ which maps $(a_1, \ldots, a_p)$ to $a_i$. By abuse of notation,
for $\beta \subseteq \mathcal{B}$, let $\pi_i(\beta)$ denote $\{\pi_i(x) \mid x \in \beta\}$. Queries to the oracle $A_i$ will be functions
$f : \mathcal{A} \to \{+1, -1\}$. Queries to the oracle $B$ will be functions $g : \mathcal{B} \to \{+1, -1\}$. The
inner verifier expects the oracles to provide the long codes of the strings $a_1, \ldots, a_p$,
i.e., $A_i(f) = f(a_i)$ and $B(g) = g(a_1, \ldots, a_p)$. Of course, we can not assume these
properties; they need to be verified explicitly by the inner verifier. We will assume
however that the tables are “folded”, i.e., $A_i(f) = -A_i(-f)$ and $B(g) = -B(-g)$ for
every $i, f, g$. (This is implemented by issuing only one of the queries $f$ or $-f$ for every
$f$ and inferring the other value, if needed, by complementing it.) We are now ready to
specify the inner verifier.

$V_{\text{inner}}^{A_1, \ldots, A_p, B}(\mathcal{A}, \mathcal{B}, \pi_1, \ldots, \pi_p)$.

1. For each $i \in \{1, \ldots, p\}$, choose $f_i : \mathcal{A} \to \{+1, -1\}$ at random.

2. Choose $f, g_1, g_2, h_1, h_2 : \mathcal{B} \to \{+1, -1\}$ at random and independently.

3. Let $g = f\,(g_1 \wedge g_2)\,(\prod_i f_i \circ \pi_i)$ and $h = f\,(h_1 \wedge h_2)\,(\prod_i f_i \circ \pi_i)$.

4. Read the following bits from the oracles $A_1, \ldots, A_p, B$:

$y_i = A_i(f_i)$, for each $i \in \{1, \ldots, p\}$.
$w = B(f)$.
$u_1 = B(g_1)$; $u_2 = B(g_2)$; $v_1 = B(h_1)$; $v_2 = B(h_2)$.
$z_1 = B(g)$; $z_2 = B(h)$.

5. Accept iff $w \prod_{i=1}^{p} y_i = (u_1 \wedge u_2)\, z_1 = (v_1 \wedge v_2)\, z_2$.

It is clear that if $a_1, \ldots, a_p$ are such that $C(a_1, \ldots, a_p) = -1$ and for every $i$ and
$f$, $A_i(f) = f(a_i)$ and for every $g$, $B(g) = g(a_1, \ldots, a_p)$, then the inner verifier
accepts with probability one. The following lemma gives a soundness condition for the
inner verifier, by showing that if the acceptance probability of the inner verifier is
sufficiently high then the oracles $A_1, \ldots, A_p$ are non-trivially close to the encoding of
strings $a_1, \ldots, a_p$ that satisfy $C(a_1, \ldots, a_p) = -1$. The proof uses, by now standard,
Fourier analysis.
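
The completeness claim can be checked mechanically on a toy instance. The sketch below simulates one run of the inner verifier against honest long codes. It rests on assumptions that are not in the text: the circuit `circuit` (which accepts iff the two answers are equal), the sizes $a = 2$ and $p = 2$, the convention that $-1$ encodes "true" in the AND, and all function names are hypothetical, and folding is not modelled.

```python
import itertools
import random


def and_pm(u, v):
    """Logical AND in the +1/-1 convention where -1 encodes 'true'."""
    return -1 if (u == -1 and v == -1) else 1


def random_function(domain, rng):
    """A uniformly random function domain -> {+1, -1}, stored as a dict."""
    return {x: rng.choice((1, -1)) for x in domain}


def inner_verifier_honest(answers, script_A, script_B, rng):
    """One run of steps 1-5 against the long codes of `answers` (in script_B)."""
    p = len(answers)
    A = [lambda f, i=i: f[answers[i]] for i in range(p)]  # A_i(f) = f(a_i)
    B = lambda g: g[answers]                               # B(g) = g(a_1..a_p)

    fs = [random_function(script_A, rng) for _ in range(p)]                 # step 1
    f, g1, g2, h1, h2 = (random_function(script_B, rng) for _ in range(5))  # step 2

    def combine(s1, s2):                                                    # step 3
        out = {}
        for b in script_B:
            val = f[b] * and_pm(s1[b], s2[b])
            for i in range(p):
                val *= fs[i][b[i]]  # (f_i o pi_i)(b) = f_i(b_i)
            out[b] = val
        return out

    g, h = combine(g1, g2), combine(h1, h2)

    lhs = B(f)                                                              # step 4
    for i in range(p):
        lhs *= A[i](fs[i])
    u1, u2, v1, v2 = B(g1), B(g2), B(h1), B(h2)
    z1, z2 = B(g), B(h)
    return lhs == and_pm(u1, u2) * z1 == and_pm(v1, v2) * z2                # step 5


# Toy instance: answers are 2-bit strings, the circuit accepts iff they agree.
script_A = list(itertools.product((1, -1), repeat=2))
circuit = lambda a1, a2: -1 if a1 == a2 else 1
script_B = [(x, y) for x in script_A for y in script_A if circuit(x, y) == -1]
```

Acceptance follows because $z_1 = g(b)$ expands to $w \cdot (u_1 \wedge u_2) \cdot \prod_i y_i$ on honest tables, and $(u_1 \wedge u_2)^2 = 1$.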

Note that the oracle $A_i$ can be viewed as a function mapping the set of functions
$\{\mathcal{A} \to \{+1, -1\}\}$ to the reals. Let the inner product of two oracles $A$ and $A'$ be defined
as $\langle A, A' \rangle = 2^{-|\mathcal{A}|} \sum_f A(f) A'(f)$. For $\alpha \subseteq \mathcal{A}$, let $\chi_\alpha(f) = \prod_{a \in \alpha} f(a)$. Then the
$\chi_\alpha$'s give an orthonormal basis for the space of oracles $A$. This allows us to express
$A(\cdot) = \sum_\alpha \hat{A}_\alpha \chi_\alpha(\cdot)$, where $\hat{A}_\alpha = \langle A, \chi_\alpha \rangle$ are the Fourier coefficients of $A$. In what
follows, we let $\hat{A}_{i,\alpha}$ denote the $\alpha$-th Fourier coefficient of the table $A_i$. Similarly one
can define a basis for the space of oracles $B$ and the Fourier coefficients of any one
oracle.
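
For intuition, the Fourier coefficients of a long code can be computed by brute force on a tiny domain. The sketch below assumes the inner product is normalised as an average over all query functions (which is what orthonormality of the characters requires) and uses a two-element stand-in for the answer set; the names are hypothetical.

```python
import itertools

script_A = [0, 1]  # toy two-element stand-in for the set of possible answers

# All functions f : script_A -> {+1, -1}, and all subsets alpha of script_A.
functions = [dict(zip(script_A, vals))
             for vals in itertools.product((1, -1), repeat=len(script_A))]
subsets = [frozenset(s) for r in range(len(script_A) + 1)
           for s in itertools.combinations(script_A, r)]


def chi(alpha, f):
    """Character chi_alpha(f) = product of f(a) over a in alpha."""
    out = 1
    for a in alpha:
        out *= f[a]
    return out


def inner(A1, A2):
    """Normalised inner product: the average of A1(f) * A2(f) over all f."""
    return sum(A1(f) * A2(f) for f in functions) / len(functions)


def fourier_coefficients(A):
    return {alpha: inner(A, lambda f, al=alpha: chi(al, f)) for alpha in subsets}


long_code_of_1 = lambda f: f[1]  # the long code of the answer a = 1
coeffs = fourier_coefficients(long_code_of_1)
```

The long code of an answer $a$ is exactly the character $\chi_{\{a\}}$, so its only non-zero coefficient sits at the singleton $\{a\}$. This is why soundness conditions are naturally phrased in terms of singleton coefficients.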

Our next claim lays out the precise soundness condition in terms of the Fourier
coefficients of the oracles $A_1, \ldots, A_p$.

Claim 9. For every $\epsilon > 0$, there exists a $\delta > 0$ such that if
$V_{\text{inner}}^{A_1, \ldots, A_p, B}(\mathcal{A}, \mathcal{B}, \pi_1, \ldots, \pi_p)$ accepts with probability at least $\frac{1}{2} + \epsilon$, then there exist
$a_1, \ldots, a_p \in \mathcal{A}$ such that $C(a_1, \ldots, a_p) = -1$ and $|\hat{A}_{i,\{a_i\}}| \ge \delta$ for every $i \in \{1, \ldots, p\}$.

There is a natural way to compose a $p$-prover MIP verifier $V_{\text{out}}$ with an inner verifier
such as $V_{\text{inner}}$ above so as to preserve perfect completeness. The number of queries
issued by the composed verifier is exactly that of the inner verifier. The randomness
is the sum of the randomness. The analysis of the soundness of such a verifier is also
standard and in particular shows that if the composed verifier accepts with probability
$\frac{1}{2} + 2\epsilon$, then there exist provers $\Pi_1, \ldots, \Pi_p$ such that $V_{\text{out}}$ accepts them with probability
at least $\epsilon \cdot \delta^{\ldots}$, where $\delta$ is from Claim 9 above. Thus we get a proof of Lemma 4.

5 Scope for Further Improvements 

The following are a few approaches which would further reduce the size-query com- 
plexity in the construction of PCPs described in this paper. 

1. An improved low-error analysis of the low-degree test of Rubinfeld and Sudan [14]
in the case when the field size is linear in the degree of the polynomial. (It is to be
noted that the current best analysis [3] requires the field size to be at least a fourth
power of the degree.) Such an analysis would reduce the proof blowup to nearly
quadratic.

2. It is known that for every $\epsilon, \delta > 0$, $\mathrm{MIP}_{1,\epsilon}[1, 0, n] \subseteq \mathrm{PCP}_{1-\delta, \frac{1}{2}}[c \log n, 3]$ from
the results of Håstad [10]. Traditionally, results of this nature have led to the
construction of inner verifiers for $p$-prover MIPs and thus showing that for every $\delta > 0$
and $p$ there exists $\epsilon > 0$ and $c$ such that

$\mathrm{MIP}_{1,\epsilon}[p, r, a] \subseteq \mathrm{PCP}_{1-\delta, \frac{1}{2}}[r + c \log a, p + 3]$.

Proving a result of this nature would reduce the query complexity of the small PCPs
constructed in this paper to 6.

Abstract. The subclass of directed series-parallel graphs plays an important
role in computer science. To determine whether a graph is series-parallel is a
well studied problem in algorithmic graph theory. Fast sequential and parallel
algorithms for this problem have been developed in a sequence of papers. For
series-parallel graphs methods are also known to solve the reachability and the
decomposition problem time efficiently. However, no dedicated results have been
obtained for the space complexity of these problems - the topic of this paper.

For this special class of graphs, we develop deterministic algorithms for the
recognition, reachability, decomposition and the path counting problem that use
only logarithmic space. Since for arbitrary directed graphs reachability and path
counting are believed not to be solvable in log-space, the main contributions of
this work are novel deterministic path finding routines that work correctly in
series-parallel graphs, and a characterisation of series-parallel graphs by forbidden
subgraphs that can be tested space-efficiently. The space bounds are best
possible, i.e. the decision problems are shown to be $\mathcal{L}$-complete with respect to
$AC^0$-reductions, and they also have implications for the parallel time complexity
of series-parallel graphs. Finally, we sketch how these results can be generalised
to extensions of the series-parallel graph family: to graphs with multiple sources
or multiple sinks and to the class of minimal vertex series-parallel graphs.



1 Introduction 

All graphs $G = (V, E)$ that will be considered in this paper are directed; $n$ denotes the
number of vertices $V$ of $G$ and $m$ the number of edges $E$. A well studied subclass of
graphs are the series-parallel graphs, for which different definitions and characterisa- 
tions have been given [6]. We will consider the basic class, sometimes also called two 
terminal series-parallel graphs, that are most important for applications in program 
analysis. 

Definition 1. $G = (V, E)$ is a series-parallel graph, SP-graph for short, if either $G$ is
a line graph of length 1, that is a pair of nodes connected by a single edge, or there
exist two disjoint series-parallel graphs $G_i = (V_i, E_i)$, $i = 1, 2$, with sources $v_{in,i}$ and
sinks $v_{out,i}$ such that $V = V_1 \cup V_2$, $E = E_1 \cup E_2$, and either

(A) parallel composition: $v_{in} = v_{in,1} = v_{in,2}$ and $v_{out} = v_{out,1} = v_{out,2}$, or

(B) series composition: $v_{in} = v_{in,1}$ and $v_{out} = v_{out,2}$ and $v_{out,1} = v_{in,2}$.

Since the sources and sinks of the $G_i$ are merged, every series-parallel graph $G$
has a unique source and a unique sink. $G$ is specified by a list of edges, but we put
no restrictions on the ordering of the edges. In particular, it is not required that this 
ordering reflects the structure of the series-parallel composition operations. Otherwise, 
recognising and handling series-parallel graphs becomes quite easy. The correctness 
and efficiency of the algorithms presented below will not depend on the representation 
of the input graphs. For example, one could use adjacency-matrices as well. 
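
The recursive definition can be turned directly into a small generator of series-parallel graphs as edge lists. This is only an illustrative sketch: the triple representation `(edges, source, sink)` and the helper names are assumptions made here, and the two arguments of a composition must be built from separate calls so that they are disjoint.

```python
import itertools

_fresh = itertools.count()  # supplies fresh node names


def edge():
    """Base case: one edge between two fresh nodes. Returns (edges, source, sink)."""
    s, t = next(_fresh), next(_fresh)
    return [(s, t)], s, t


def _rename(edges, old, new):
    """Replace node `old` by `new` in every edge (this merges the two nodes)."""
    return [(new if u == old else u, new if v == old else v) for u, v in edges]


def series(g1, g2):
    """Series composition: the sink of g1 is merged with the source of g2."""
    (e1, s1, t1), (e2, s2, t2) = g1, g2
    return e1 + _rename(e2, s2, t1), s1, t2


def parallel(g1, g2):
    """Parallel composition: both sources are merged, and both sinks are merged."""
    (e1, s1, t1), (e2, s2, t2) = g1, g2
    return e1 + _rename(_rename(e2, s2, s1), t2, t1), s1, t1


# A "diamond": two length-2 paths composed in parallel.
diamond = parallel(series(edge(), edge()), series(edge(), edge()))
```

The edge list produced this way happens to follow the composition order, which the text explicitly does not assume of the input; shuffling it gives a more representative test input.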

Series-parallel graphs are suitable to describe the information flow within a program 
that is based on sequential and parallel composition. The graphical description of a 
program helps to decide whether it can be parallelised and to generate schedules for a 
parallel execution. 

To determine if a given graph $G$ belongs to the class of series-parallel graphs is a
basic problem in algorithmic graph theory. An optimal linear time sequential algorithm
for this problem was developed by Valdes, Tarjan, and Lawler in [15] a long time
ago. Also, fast parallel algorithms have been published. He and Yesha have presented an
EREW PRAM algorithm working in time $O(\log^2 n)$ while using $n + m$ processors [12].
Eppstein has reduced the time bound, constructing an algorithm that takes only $O(\log n)$
steps on the stronger PRAM model with concurrent instead of exclusive read and write,
and that requires $C(m, n)$ processors [11]. Here $C(m, n)$ denotes the number of processors
necessary to compute the connected components of a graph in logarithmic time. Finally,
Bodlaender and de Fluiter have presented an EREW PRAM algorithm using
$O(\log n \cdot \log^* n)$ time and $O(n + m)$ operations [5].

The space complexity of this problem, however, has not been investigated success- 
fully so far. In this paper we give an answer to this question. 

The decomposition of a series-parallel graph is quite useful to decide other graph
properties. Hence, another important task is to compute such a decomposition effi- 
ciently. In [15] a linear-time sequential algorithm for decomposing series-parallel graphs 
has been given. We will show that this task can be done in small space as well. 

For general graphs, the reachability problem, that is the question whether there
exists a path between a given pair of nodes, is the classical $\mathcal{NL}$-complete problem. By
well known simulations, for the parallel time complexity one can infer a logarithmic
upper bound on CRCW PRAMs. The reachability problem restricted to series-parallel
graphs, however, can be solved in logarithmic time already by an EREW PRAM using
the minimal number $(n + m)/\log n$ of processors [15]. Certain graph properties
like acyclicity are also complete for $\mathcal{NL}$, while for other problems their computational
complexity is still unsolved. Recently, Allender and Mahajan have made a major step in
classifying the computational complexity of planarity testing, showing that this problem
is hard for $\mathcal{L}$ and belongs to $\mathcal{SL}$ (symmetric Logspace) [3]. They leave as an open
problem to close the gap between the lower bound and the upper bound. In this paper we
determine the computational complexity of a nontrivial subproblem of planarity testing
precisely. For series-parallel graphs this question is $\mathcal{L}$-complete.

For $\mathcal{L}$ several simple graph problems are known to be complete with respect to
$AC^0$-reductions: for example, whether a graph is a forest or even a tree, or whether in a given
forest $G$ two nodes belong to the same tree (for a complete list see [9,13]). In this paper
we will prove three problems for series-parallel graphs to be $\mathcal{L}$-complete as well: the
recognition problem, the reachability problem, and counting the number of paths mod
2. While the hardness of these problems can be obtained in a straightforward way, it
requires a lot of algorithmic effort to prove that the lower bound can actually be achieved.
Thus, the main technical contributions of this paper are new graph-theoretical notions
and algorithmic methods that allow us to solve these problems using only logarithmic
space.

Furthermore, not only decision problems for series-parallel graphs turn out to be
tractable. A decomposition of such graphs can be computed within the same space
bound as well. For general graphs counting the number of paths is one of the generic
complete problems for the class $\#\mathcal{L}$ [2]. Thus, this problem is not computable in $F\mathcal{L}$,
the functional deterministic log-space complexity class, unless certain hierarchies
collapse. We will prove that restricting to series-parallel graphs the counting problem can
be solved in $F\mathcal{L}$. This will be achieved by combining our space efficient reachability
decision procedure with a modular representation of numbers requiring only little
space, and the recent result that a Chinese Remainder Representation can be converted
to the standard binary one in logarithmic space [8].
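
The idea of counting modulo several small primes and then recombining can be illustrated as follows. This sketch is not the log-space procedure: the memoised traversal below uses linear space and the conversion is ordinary Chinese remaindering. It only shows why small residues suffice to recover the count once the product of the moduli exceeds it. The function names are hypothetical.

```python
from math import prod


def count_paths_mod(edges, source, sink, modulus):
    """Number of source-to-sink paths in an acyclic graph, reduced mod `modulus`."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
    memo = {}

    def go(u):
        if u == sink:
            return 1 % modulus
        if u not in memo:
            memo[u] = sum(go(v) for v in succ.get(u, [])) % modulus
        return memo[u]

    return go(source)


def crt(residues, moduli):
    """Combine residues for pairwise coprime moduli into one integer below their product."""
    total, x = prod(moduli), 0
    for r, m in zip(residues, moduli):
        partial = total // m
        x += r * partial * pow(partial, -1, m)  # modular inverse (Python 3.8+)
    return x % total
```

For a chain of five diamonds (32 paths), the residues modulo 3, 5 and 7 are 2, 2 and 4, and `crt([2, 2, 4], [3, 5, 7])` recovers 32 because $3 \cdot 5 \cdot 7 = 105 > 32$.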

Because of the relation between L and parallel time complexity classes defined by 
the EREW PRAM model (see [14]) these new algorithms can be modified to solve these 
problems in logarithmic time on EREW PRAMs as well. Finally, these results can also 
be extended to generalizations of series-parallel graphs: multiple-source or multiple-sink,
and minimal vertex series-parallel graphs.

This paper is organized as follows. In Section 2 we will prove the $\mathcal{L}$-hardness of
the reachability and the recognition problem. Procedures solving these problems within
logarithmic space will be described in detail in Section 3. Section 4 outlines an
algorithm that generates an edge ordering that reflects the structure of a given
series-parallel graph. Based on this ordering we sketch a decomposition algorithm in
Section 5. In Section 6, we combine the methods presented so far to solve the path
counting problem. Finally, in Section 7 it will be indicated how these results can be extended
to generalizations of series-parallel graphs. The paper ends with some conclusions and
open problems.

2 Hardness Results 

To establish meaningful lower bounds for the deterministic space complexity class $\mathcal{L}$
one has to restrict the concept of polynomial time many-one reductions to simpler
functions. We consider the usual requirement that the reducing function $f$ can be computed
in $AC^0$. The $\mathcal{L}$-hardness for series-parallel graphs can be shown in a direct way.

Theorem 1. The following problems are hard for $\mathcal{L}$ under $AC^0$ reducibility: (i)
recognition of series-parallel graphs, (ii) reachability in series-parallel graphs, and (iii)
counting the number of paths mod 2.

Proof: Let $L$ be a language in $\mathcal{L}$ and $M$ a logarithmic space-bounded deterministic
Turing machine accepting $L$ by taking at most $n^k$ steps on inputs $X$ of length $n$, where
$k$ is a fixed exponent. We may assume that $M$ has unique final configurations $C_{acc}$,
the accepting one, and $C_{rej}$, the rejecting one. In addition, all configurations $C$ of $M$
on $X$ are time-stamped, that is, they are actually tuples $(C, t)$ with $0 \le t \le n^k$. Then
the successor configuration of $(C, t)$ is $(C', t + 1)$ if $t < n^k$ and $M$ can move in one
step from $C$ to $C'$. If $C$ is a final configuration and $t < n^k$ then $(C, t + 1)$ is the
successor of $(C, t)$. For input $X$ we construct a directed graph $G_X$, where the
time-stamped configurations $(C, t)$ are the vertices of $G_X$ and edges represent (the inverse
of) the successor relation: $G_X$ contains the edge $((C', t + 1), (C, t))$ iff $(C', t + 1)$ is a
successor of $(C, t)$. Obviously, $G_X$ is a forest consisting of trees with roots of the form
$(C, n^k)$. To prove the hardness of the reachability problem we augment $G_X$ by two new
nodes $u$ and $v$. For every configuration $(C, n^k)$ the edge $(u, (C, n^k))$ is added, and for
every leaf $(C, t)$ the edge $((C, t), v)$. It is easy to see that the resulting graph is
series-parallel with source $u$ and sink $v$. Furthermore, it contains a path from $(C_{acc}, n^k)$ to
$(C_{init}, 0)$, where $(C_{init}, 0)$ represents the starting configuration of $M$, iff $M$ accepts $X$.
The reduction itself can be computed in $AC^0$.

The hardness of the recognition problem and the counting problem can be shown in
a similar way. □



3 Recognition and Reachability in Logspace 

Establishing corresponding upper bounds is not obvious at all. We will give a
space-efficient characterization of series-parallel graphs by forbidden subgraphs and exploit the
structure of internal paths very thoroughly. Assume that the nodes of the input graph
$G$ are represented by the set of numbers $\{1, 2, \ldots, n\}$. $G$ is given by a list of edges
$(i_1, j_1), (i_2, j_2), \ldots, (i_m, j_m)$, where $i_k, j_k$ are binary representations of the names of
the nodes. Let pred($v$) denote the set of direct predecessors of $v$, and pred($v, i$) the
$i$-th direct predecessor of $v$ according to the ordering implicitly given by the
specification of $G$. Similarly, let succ($v$) and succ($v, i$) be the set of direct successors of $v$,
resp. its $i$-th direct successor. succ($v, i$) and pred($v, i$) can be computed in deterministic
logarithmic space: the Turing machine searches through the list of edges looking for the
$i$-th entry that starts (resp. ends) with $v$.

Define pred$^+(v)$, resp. succ$^+(v)$, as the transitive closure of pred($v$), resp. succ($v$),
not containing $v$, and pred$^*(v)$ := pred$^+(v) \cup \{v\}$ and succ$^*(v)$ := succ$^+(v) \cup \{v\}$. To
shorten the notation, let us introduce the predicate PATH($u, v$) being true iff the given
graph $G$ possesses a path from node $u$ to $v$. Thus,

PATH($u, v$) $\iff$ $u \in$ pred$^*(v)$ $\iff$ $v \in$ succ$^*(u)$.

Remember that deciding PATH for arbitrary graphs is $\mathcal{NL}$-complete. To construct a
deterministic space efficient algorithm solving this problem for series-parallel graphs
we introduce the following concepts:

lm-down($v$) := the max. acyclic path $v = u_1, u_2, \ldots, u_l$ with $u_{i+1} =$ succ($u_i$, 1),
lm-up($v$) := the max. acyclic path $v = u_1, u_2, \ldots, u_l$ with $u_{i+1} =$ pred($u_i$, 1),

lm-pred$^*(v)$ := $\{u \mid$ lm-down($u$) $\cap$ lm-up($v$) $\neq \emptyset\}$, and
lm-succ$^*(v)$ := $\{u \mid$ lm-down($v$) $\cap$ lm-up($u$) $\neq \emptyset\}$.

Here, “lm” stands for left-most, that means in each node $u_i$ the path follows the first
edge as specified by the representation of $G$. A path being acyclic requires that all its
nodes $u_i$ are different. Thus, a maximal acyclic down-path either ends in a sink or stops
immediately before hitting a node as the left-most successor a second time. These sets
can be decided by the procedure membership-test1. The algorithm for testing whether
$u \in$ lm-up($v$) is just the symmetric dual.

procedure membership-test1[u ∈ lm-down(v)]
1 let n be the number of nodes in G; x := v; i := 1;
2 while x ≠ u and |succ(x)| > 0 and i < n do
3     let x := succ(x, 1); i := i + 1 od
4 if x = u then return TRUE else return FALSE
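The walk performed by membership-test1 can be sketched in Python as follows. This is only an illustration under assumptions that are not part of the paper: the graph is given as a Python list of edge pairs, the helper names `first_succ` and `in_lm_down` are hypothetical, and the sketch does not try to respect the logarithmic space bound.

```python
def first_succ(edges, x):
    """succ(x, 1): target of the first listed edge leaving x, or None for a sink."""
    for a, b in edges:
        if a == x:
            return b
    return None


def in_lm_down(edges, n, u, v):
    """membership-test1: is u on the left-most down-path lm-down(v)?"""
    x, i = v, 1
    # Walk along first successors; the counter i bounds the walk by n - 1 steps,
    # so the loop also terminates when the left-most path runs into a cycle.
    while x != u and first_succ(edges, x) is not None and i < n:
        x = first_succ(edges, x)
        i += 1
    return x == u


# Hypothetical example: the "diamond" s -> a -> t, s -> b -> t.
diamond = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t")]
print(in_lm_down(diamond, 4, "t", "s"))  # t lies on lm-down(s) = s, a, t
print(in_lm_down(diamond, 4, "b", "s"))  # b does not lie on lm-down(s)
```

Scanning the edge list again for every step mirrors the way the Turing machine in the text searches for the first entry that starts with the current node.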





Fig. 1. $u \notin \text{lm-down}(v)$ and $v \notin \text{lm-up}(u)$.
Fig. 2. $v \in \text{lm-pred}^*(u)$ and $u \in \text{lm-succ}^*(v)$.



To check if $u \in \text{lm-pred}^*(v)$ one can use the procedure membership-test2, which
uses membership-test1 to decide lm-up and lm-down.

procedure membership-test2[u ∈ lm-pred*(v)]
1 result := FALSE
2 forall nodes x in G do
3     if x ∈ lm-down(u) and x ∈ lm-up(v) then let result := TRUE od
4 return result
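A minimal Python sketch of membership-test2 is given below. The edge-list representation and the helper names `step`, `on_lm_path` and `in_lm_pred_star` are assumptions made for illustration; the code shows the logic of the test only and is not a log-space implementation.

```python
def step(edges, x, down):
    """succ(x, 1) if down is True, otherwise pred(x, 1); None if no such edge exists."""
    for a, b in edges:
        if down and a == x:
            return b
        if not down and b == x:
            return a
    return None


def on_lm_path(edges, n, start, target, down):
    """Is target on lm-down(start) (down=True) or on lm-up(start) (down=False)?"""
    x, i = start, 1
    while x != target and step(edges, x, down) is not None and i < n:
        x = step(edges, x, down)
        i += 1
    return x == target


def in_lm_pred_star(edges, nodes, u, v):
    """membership-test2: does lm-down(u) intersect lm-up(v)?"""
    n = len(nodes)
    return any(
        on_lm_path(edges, n, u, x, True) and on_lm_path(edges, n, v, x, False)
        for x in nodes
    )


# Hypothetical example graph: s -> a -> t and s -> b -> t.
nodes = ["s", "a", "b", "t"]
diamond = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t")]
print(in_lm_pred_star(diamond, nodes, "b", "t"))  # lm-down(b) and lm-up(t) share t
print(in_lm_pred_star(diamond, nodes, "a", "b"))  # the two left-most paths are disjoint
```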



In the dual way we can test whether $u \in \text{lm-succ}^*(v)$. Hence it follows:

Lemma 1. For an arbitrary graph $G$ and node $v$ the membership problem for the sets
$\text{lm-down}(v)$, $\text{lm-up}(v)$, $\text{lm-pred}^*(v)$, and $\text{lm-succ}^*(v)$ can be solved deterministically in
logarithmic space.

A graph $G$ is called st-connected if $G$ has a unique source named $s$ and a unique
sink named $t$, and for every node $v$ it holds: $\text{PATH}(s, v)$ and $\text{PATH}(v, t)$.

We start with the procedure preliminary-test. For an acyclic graph $G$ it returns TRUE
iff $G$ is st-connected. If $G$ contains a cycle $C = (v_1, v_2, \ldots, v_l)$ the procedure will
detect the cycle if it is on a left-most path. In such a case the procedure outputs FALSE.
Cycles that are not of this form will not be detected, and the procedure erroneously may
output TRUE.

procedure preliminary-test(G)
1 if not [G has a unique source s and a unique sink t]
2     then return FALSE and exit
3 forall nodes v in G do
4     if t ∉ lm-down(v) or s ∉ lm-up(v) then return FALSE and exit
5 return TRUE
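The following Python sketch mirrors preliminary-test. The list-of-pairs graph encoding and the helper names `step` and `on_lm_path` are illustrative assumptions; the second example shows a cycle on a left-most path being rejected, as described above.

```python
def step(edges, x, down):
    """succ(x, 1) if down is True, otherwise pred(x, 1); None if no such edge exists."""
    for a, b in edges:
        if down and a == x:
            return b
        if not down and b == x:
            return a
    return None


def on_lm_path(edges, n, start, target, down):
    """Is target on lm-down(start) (down=True) or on lm-up(start) (down=False)?"""
    x, i = start, 1
    while x != target and step(edges, x, down) is not None and i < n:
        x = step(edges, x, down)
        i += 1
    return x == target


def preliminary_test(edges, nodes):
    sources = [x for x in nodes if all(b != x for _, b in edges)]
    sinks = [x for x in nodes if all(a != x for a, _ in edges)]
    if len(sources) != 1 or len(sinks) != 1:       # lines 1-2
        return False
    s, t, n = sources[0], sinks[0], len(nodes)
    for v in nodes:                                # lines 3-4
        if not on_lm_path(edges, n, v, t, True) or not on_lm_path(edges, n, v, s, False):
            return False
    return True                                    # line 5


# Hypothetical inputs: a diamond, and a graph whose left-most path from s cycles a -> b -> a.
diamond = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t")]
cyclic = [("s", "a"), ("a", "b"), ("b", "a"), ("b", "t")]
print(preliminary_test(diamond, ["s", "a", "b", "t"]))
print(preliminary_test(cyclic, ["s", "a", "b", "t"]))
```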



Lemma 2. The procedure preliminary-test can be implemented deterministically in
log-space. Moreover, if preliminary-test($G$) outputs TRUE then $G$ is st-connected. If
it outputs FALSE then at least one of the following conditions holds: (i) $G$ has more
than one source or more than one sink, or (ii) $G$ is st-connected, but it has a cycle.

The proof of this lemma is straightforward and we omit it. Note that a graph $G$ with
output TRUE can still have a cycle. To detect this property is difficult for deterministic
machines since this question can easily be shown to be $\mathcal{NL}$-complete. Therefore, we
look for a simpler task.

Let $W$ denote the graph shown in Fig. 3. A graph $W'$ is homeomorphic to $W$ if
it contains four distinct vertices $a, b, c, d$ and pairwise internally vertex disjoint paths
$P_{ab}$, $P_{ac}$, $P_{bd}$, $P_{cd}$ and $P_{bc}$. If $G$ contains a homeomorphic image of $W$ as a subgraph
then $W$ is called a minor of $G$. The following characterization of series-parallel graphs
by forbidden minors has been known for long [10, 15]. Let $G$ be an st-connected acyclic
graph. Then $G$ is series-parallel iff $W$ is not a minor of $G$.




Fig. 3. The Forbidden Minor W.
Fig. 4. The Forbidden Induced Subgraph H.



To make series-parallel graph recognition space efficient, instead of searching for
the forbidden minor $W$ we will use the following characterization. Let $H$ be a graph
with four distinct nodes $z_1, z_2, z_3, z_4$ such that

1. $(z_1, z_2)$, $(z_3, z_4)$ are edges of $H$ and $\text{PATH}(z_1, z_4)$,

2. $\neg\text{PATH}(z_1, z_3)$ and $\neg\text{PATH}(z_2, z_4)$.

These conditions are illustrated in Fig. 4. In the following we will show how $H$ can
be used to determine whether a graph is series-parallel. We say that $H$ is an induced
subgraph of $G$ if $G$ contains four nodes $z_1, z_2, z_3, z_4$ which fulfil these connectivity
conditions.

Theorem 2. Let $G$ be an st-connected acyclic graph. Then $G$ is series-parallel iff it
does not contain $H$ as an induced subgraph.

This follows by showing that an st-connected acyclic graph $G$ contains $H$ as an
induced subgraph iff $W$ is a minor of $G$.

Now, we will deduce the key property that makes reachability in series-parallel
graphs easier compared to arbitrary graphs. Although the parallel composition operator
introduces a lot of nondeterminism into the structure of these graphs, when trying to
find a path from a node $u$ to a node $v$ this question can be solved by considering the
unique lm-down-path starting at $u$ and the unique lm-up-path starting in $v$ and deciding
whether these two intersect. In other words, it holds:

Theorem 3. If $G$ is series-parallel then $\text{pred}^*(v) = \text{lm-pred}^*(v)$ for every node $v$.

Proof: Assume that $\text{pred}^*(v) \neq \text{lm-pred}^*(v)$ for some node $v$ of $G$. We will show that
then $H$ has to be an induced subgraph of $G$ - a contradiction to Theorem 2.

Obviously, $v$ cannot be the source $s$ of $G$. Since $G$ is st-connected and acyclic every
lm-down-path from an arbitrary node $u$ has to terminate in the sink $t$. Thus, for $t$ holds
$\text{pred}^*(t) = \text{lm-pred}^*(t) = V$, and hence $v \neq t$.

Let $u_1 \in \text{pred}^*(v) \setminus \text{lm-pred}^*(v)$. Since every lm-up-path terminates in the source $s$
we can conclude $u_1 \neq s$. Let $u_1, \ldots, u_k = t$ be the leftmost down-path $\text{lm-down}(u_1)$.
$u_1 \notin \text{lm-pred}^*(v)$ implies that $u_i \neq v$ for all $i \in [1..k]$. Furthermore, let $v_1 =
v, v_2, \ldots, v_l = s$ be the leftmost up-path $\text{lm-up}(v)$ from $v$.

Since $u_1 \in \text{pred}^*(v_1)$ there exists a non-trivial path from $u_1$ to $v_1$. On the other
hand, because of $v_l = s$ and $v_1 = v \neq s$ it holds $\neg\text{PATH}(u_1, v_l)$, and similarly
because of $u_k = t$ and $v_1 \neq t$, $\neg\text{PATH}(u_k, v_1)$. Hence, there exist $i \in [1..k-1]$ and
$j \in [1..l-1]$ such that $\text{PATH}(u_i, v_j)$, $\neg\text{PATH}(u_i, v_{j+1})$, and $\neg\text{PATH}(u_{i+1}, v_j)$.

The nodes $z_1 := u_i$, $z_2 := u_{i+1}$, $z_3 := v_{j+1}$, and $z_4 := v_j$ prove that $H$ is an
induced subgraph of $G$. ∎

Thus, if for some node $v$ of $G$ the relation $\text{pred}^*(v) = \text{lm-pred}^*(v)$ is violated one
can conclude that $G$ is not series-parallel. This equality, however, can be tested space
efficiently.

Lemma 3. There exists a deterministic logarithmic space-bounded Turing machine
that for arbitrary $v \in G$ decides whether $\text{pred}^*(v) = \text{lm-pred}^*(v)$.

Proof: Assume that $\text{pred}^*(v) \neq \text{lm-pred}^*(v)$. First, we claim that there has to be an edge
$(u, w) \in E$ such that $u \in \text{pred}^*(v) \setminus \text{lm-pred}^*(v)$ and $w \in \text{lm-pred}^*(v)$. To see this,
let $x \in \text{pred}^*(v) \setminus \text{lm-pred}^*(v)$ and $u_1 = x, u_2, u_3, \ldots, u_k = v$ be a down-path from $x$
to $v$. Obviously, $u_1, u_2, \ldots, u_k \in \text{pred}^*(v)$, $u_1 \notin \text{lm-pred}^*(v)$, and $u_k \in \text{lm-pred}^*(v)$.
Therefore, there exists an index $i \in [1..k-1]$ such that $u_i \in \text{pred}^*(v) \setminus \text{lm-pred}^*(v)$
and $u_{i+1} \in \text{lm-pred}^*(v)$. This proves our claim. Now it is easy to see that the following
algorithm answers the question whether $\text{pred}^*(v) = \text{lm-pred}^*(v)$:

procedure equality-test[pred*(v) = lm-pred*(v)]
1 result := TRUE
2 forall edges (u, w) in G do
3     if (u ∉ lm-pred*(v)) ∧ (w ∈ lm-pred*(v)) then result := FALSE od
4 return result

∎
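The edge scan of equality-test can be sketched in Python as shown below. The graph encoding, the helper names and the two example graphs are assumptions made for illustration. The second graph is the forbidden graph $W$ with an edge ordering chosen so that the left-most paths miss the edge from b to c.

```python
def step(edges, x, down):
    """succ(x, 1) if down is True, otherwise pred(x, 1); None if no such edge exists."""
    for a, b in edges:
        if down and a == x:
            return b
        if not down and b == x:
            return a
    return None


def on_lm_path(edges, n, start, target, down):
    x, i = start, 1
    while x != target and step(edges, x, down) is not None and i < n:
        x = step(edges, x, down)
        i += 1
    return x == target


def in_lm_pred_star(edges, nodes, u, v):
    """Does lm-down(u) intersect lm-up(v)?"""
    n = len(nodes)
    return any(
        on_lm_path(edges, n, u, x, True) and on_lm_path(edges, n, v, x, False)
        for x in nodes
    )


def equality_test(edges, nodes, v):
    """Look for an edge (u, w) that leads from outside lm-pred*(v) into lm-pred*(v)."""
    result = True
    for u, w in edges:
        if not in_lm_pred_star(edges, nodes, u, v) and in_lm_pred_star(edges, nodes, w, v):
            result = False
    return result


diamond = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t")]
w_graph = [("a", "b"), ("a", "c"), ("b", "d"), ("b", "c"), ("c", "d")]
print(all(equality_test(diamond, ["s", "a", "b", "t"], v) for v in "sabt"))
print(equality_test(w_graph, ["a", "b", "c", "d"], "c"))  # edge (b, c) violates the equality
```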



Corollary 1. Let $G$ be an st-connected graph with $\text{pred}^*(v) = \text{lm-pred}^*(v)$ for every
node $v$. Then reachability within $G$ can be decided in $\mathcal{L}$.

procedure SER-PAR(G)
1 if preliminary-test(G) returns FALSE then return FALSE and exit
2 forall nodes v in G do
3     if pred*(v) ≠ lm-pred*(v) then return FALSE and exit od
4 forall pairs of nodes x, y in G do
5     if x ∈ lm-pred*(y) ∧ y ∈ lm-pred*(x)
6         then return FALSE and exit od
7 forall pairs of edges (z1, z2), (z3, z4) with z1 ≠ z4 do
8     if z1 ∈ lm-pred*(z4) ∧ z1 ∉ lm-pred*(z3) ∧ z2 ∉ lm-pred*(z4)
9         then return FALSE and exit od
10 return TRUE
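Putting the pieces together, SER-PAR can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: graphs are lists of edge pairs without parallel edges, all helper names are hypothetical, the pairs in lines 4-6 are taken to be pairs of distinct nodes, and no attempt is made to stay within logarithmic space.

```python
from itertools import permutations


def step(edges, x, down):
    """succ(x, 1) if down is True, otherwise pred(x, 1); None if no such edge exists."""
    for a, b in edges:
        if (a if down else b) == x:
            return b if down else a
    return None


def on_lm_path(edges, n, start, target, down):
    x, i = start, 1
    while x != target and step(edges, x, down) is not None and i < n:
        x, i = step(edges, x, down), i + 1
    return x == target


def lm_pred(edges, nodes, u, v):
    """u in lm-pred*(v): lm-down(u) and lm-up(v) intersect."""
    n = len(nodes)
    return any(on_lm_path(edges, n, u, x, True) and on_lm_path(edges, n, v, x, False)
               for x in nodes)


def preliminary_test(edges, nodes):
    sources = [x for x in nodes if all(b != x for _, b in edges)]
    sinks = [x for x in nodes if all(a != x for a, _ in edges)]
    if len(sources) != 1 or len(sinks) != 1:
        return False
    s, t, n = sources[0], sinks[0], len(nodes)
    return all(on_lm_path(edges, n, v, t, True) and on_lm_path(edges, n, v, s, False)
               for v in nodes)


def equality_test(edges, nodes, v):
    return not any(not lm_pred(edges, nodes, u, v) and lm_pred(edges, nodes, w, v)
                   for u, w in edges)


def ser_par(edges, nodes):
    if not preliminary_test(edges, nodes):                         # line 1
        return False
    if not all(equality_test(edges, nodes, v) for v in nodes):     # lines 2-3
        return False
    for x, y in permutations(nodes, 2):                            # lines 4-6: acyclicity
        if lm_pred(edges, nodes, x, y) and lm_pred(edges, nodes, y, x):
            return False
    for z1, z2 in edges:                                           # lines 7-9: search for H
        for z3, z4 in edges:
            if (z1 != z4 and lm_pred(edges, nodes, z1, z4)
                    and not lm_pred(edges, nodes, z1, z3)
                    and not lm_pred(edges, nodes, z2, z4)):
                return False
    return True                                                    # line 10


diamond = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t")]
w_graph = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d")]
print(ser_par(diamond, ["s", "a", "b", "t"]))  # series-parallel
print(ser_par(w_graph, ["a", "b", "c", "d"]))  # contains H with z1=b, z2=d, z3=a, z4=c
```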

The procedure SER-PAR specified above decides for an arbitrary graph $G$ whether
it is series-parallel. To prove its correctness we argue as follows. From Lemma 2 one
can conclude that the algorithm stops at line 1 and outputs FALSE if $G$ has more than
one source or more than one sink, or $G$ is st-connected, but it has a cycle. Hence, $G$ is
not series-parallel and the answer FALSE is correct. On the other hand, if the procedure
does not stop at line 1 then $G$ is st-connected.

Furthermore, if SER-PAR($G$) outputs FALSE in line 3 then $\text{pred}^*(v) \neq \text{lm-pred}^*(v)$
for some node $v$. By Theorem 3 it follows that this answer is correct, too. If the algorithm
continues, we can presuppose at the beginning of line 4 that $G$ is st-connected and
for any $v$ it holds $\text{pred}^*(v) = \text{lm-pred}^*(v)$. In lines 4-6 we check whether $G$ is acyclic,
and stop if not. The answer will be correct since $\text{lm-pred}^*(y)$ contains all predecessors
of a node $y$.

Let us recapitulate the conditions a graph $G$ has to fulfil such that SER-PAR($G$)
does not stop before line 7: $G$ has to be st-connected, acyclic and for every pair of
nodes $x, y$ in $G$ it holds: $\text{PATH}(y, x) \iff y \in \text{lm-pred}^*(x)$. This guarantees that in
lines 7-9 the existence of $H$ as an induced subgraph is tested correctly. Finally, since all
tests applied can be performed in deterministic logarithmic space we can conclude:

Theorem 4. The question whether a graph is series-parallel can be decided in $\mathcal{L}$.



4 An Edge Ordering Algorithm 

For a graph specified by a list of edges we have made no assumptions about their ordering.
In particular, this ordering is not required to reflect the construction process of
the series-parallel graph in any way. In this section we present a log-space algorithm
that given a series-parallel graph $G$ outputs a special ordering called SP-ordering. The
crucial property of this ordering is that for any series-parallel component $C$ with source
$v$ all direct successors of $v$ in $C$ are enumerated with consecutive integers. Speaking
formally, for a node $v$ the sequence $\text{SP-succ}(v)$ is a permutation of $\text{succ}(v)$ such that
for all $u \in \text{succ}^+(v)$ the set $\{ i \mid \text{SP-succ}(v, i) \in \text{pred}^*(u) \}$ consists of consecutive
integers. Here, for $1 \le i \le |\text{succ}(v)|$ the value $\text{SP-succ}(v, i)$ denotes the $i$'th vertex
in the SP-ordering of $\text{succ}(v)$. Recall that $\text{succ}(v, i)$ is the $i$'th direct successor of $v$
according to the ordering implicitly given by the input specification. Hence, in general
$\text{SP-succ}(v, i)$ will be different from $\text{succ}(v, i)$. To compute the SP-ordering we will
use $\text{succ}(v, i)$ and the following function for a node $u$ different from $s$ and $t$:

$$\text{START}(u) := \text{nearest } v \in \text{pred}^+(u) \text{, such that any path from } s \text{ to } u \text{ contains } v.$$

It is easy to see that $\text{START}(u)$ gives the source $v$ of a smallest series-parallel component
containing $u$ and its direct predecessors in $\text{pred}(u)$. If $\text{pred}(u)$ contains only a
single node $v$ then $\text{START}(u) = v$. Otherwise, $\text{START}(u)$ can be computed by finding
the nearest common predecessor of the left-most up-paths from any direct predecessor
of $u$ to the source of $G$. Let $\text{START}^{-1}(v) := \{ u \mid \text{START}(u) = v \}$. Both $\text{START}(u)$
and $\text{START}^{-1}(v)$ can be computed in logarithmic space.

Let us now introduce an important notion, which arises from our analysis of series-parallel
graphs in the previous section. Let $v_1 \neq v_2$ be two arbitrary nodes of $G$. Then
we define the set of bridge nodes between $v_1$ and $v_2$ as follows:

$$\text{BRIDGES}(v_1, v_2) := \{ u \in \text{START}^{-1}(v_1) \cap \text{pred}^+(v_2) \mid
\forall w \in \text{succ}^+(u) \cap \text{pred}^+(v_2) : w \notin \text{START}^{-1}(v_1) \} .$$

Obviously, $\text{BRIDGES}(v_1, v_2) \neq \emptyset$ if $\text{START}^{-1}(v_1) \cap \text{pred}^+(v_2) \neq \emptyset$.




Fig. 5. Marked: The Nodes in $\text{START}^{-1}(v_1)$; Black: The subset $\text{BRIDGES}(v_1, v_2)$.



Using the functions $\text{succ}(v, i)$ and $\text{START}(v)$, the set $\text{BRIDGES}(v_1, v_2)$ can be
computed deterministically in logarithmic space. Let $\text{BRIDGES}(v_1, v_2, i)$ be the $i$-th
element of such an enumeration of the nodes in this set. Furthermore, given a node
$v_3 \in \text{BRIDGES}(v_1, v_2)$ and $v_1$, the lower endpoint $v_2$ can be determined in logarithmic
space as well.

We now describe a recursive procedure SP-sequence($v$, $u$) that outputs the sequence
of direct successors of a node $v$ in SP-ordering. This procedure will initially be called
with the successor $u$ of $v$ in $\text{START}^{-1}(v)$ that is furthest from $v$. By definition of
$\text{START}(u)$ this successor is unique.

procedure SP-sequence(v, u)
1 if u ∈ succ(v) then output u
2 for i = 1 to |BRIDGES(v, u)| do SP-sequence(v, BRIDGES(v, u, i)) od

Using a log-space algorithm to compute $\text{BRIDGES}(v, u, i)$ one can implement this
procedure with logarithmic space as well. Furthermore, $\text{SP-succ}(v, i)$ can be computed
by counting the nodes in the output of the procedure SP-sequence($v$, FINAL($v$)).



5 The Decomposition in Log-Space

The decomposition tree of a series-parallel graph provides information how this graph 
has been built using the parallel and serial constructors. 

Definition 2. A binary tree $T = (V_T, E_T)$ with a labeling function $\sigma : V_T \to \{p, s\} \cup
E$ is called a decomposition tree of an SP-graph $G = (V, E)$ iff leaves of $T$ are
labeled with elements of $E$, internal nodes with $p$ or $s$ and $G$ can be generated recursively
using $T$ as follows: If $T$ is a single node $v$ then $G$ consists of the single edge $\sigma(v)$.
Otherwise, let $T_1$ (resp. $T_2$) be the right (resp. left) subtree of $T$ and $G_i$ be SP-graphs
with decomposition tree $T_i$: if $\sigma(v) = p$ (resp. $s$) then $G$ is the parallel (resp. serial)
composition of $G_1$ and $G_2$.

The algorithm to generate a decomposition tree is based on the functions $\text{START}(v)$,
and $\text{BRIDGES}(u, v, i)$ described in the previous section. Given a series-parallel graph
$G$ with source $s$ and sink $t$, the procedure SP-DECOMP($s$, $t$) outputs the root DTR of a
decomposition tree of $G$. As an example for such a decomposition see Fig. 6.

procedure SP-DECOMP(u, v)
1 if |BRIDGES(u, v)| ≥ 1 then do
2     getnode(r); σ(r) := s;
3     left(r) := SP-DECOMP(u, BRIDGES(u, v, 1))
4     right(r) := SP-DECOMP(BRIDGES(u, v, 1), v)
5     DTR := r;
6     for i := 2 to |BRIDGES(u, v)| do
7         getnode(c); σ(c) := s;
8         left(c) := SP-DECOMP(u, BRIDGES(u, v, i))
9         right(c) := SP-DECOMP(BRIDGES(u, v, i), v)
10        getnode(b); σ(b) := p; left(b) := DTR; right(b) := c;
11        DTR := b endfor
12 endif
13 if (u, v) ∈ E then do
14     getnode(c); σ(c) := (u, v);
15     if |BRIDGES(u, v)| > 0 then do
16         getnode(b); σ(b) := p; left(b) := DTR; right(b) := c;
17         DTR := b od
18     else DTR := c endif
19 endif
20 return DTR.

To achieve space efficiency we do not want to store the values of the variables for 
each recursive activation of SP-DECOMP as it is done in standard implementation of 
recursion. In our special situation these values of the calling activation of SP-DECOMP 
can be recomputed when returning from a recursive call, thus we don’t have to store 
them explicitly. 



Theorem 5. The decomposition tree of an SP-graph can be computed in $\mathcal{FL}$.

Fig. 6. An example of a decomposition tree generated by SP-DECOMP($u_1$, $u_2$); the marked subtrees
are generated by SP-DECOMP and consist of a single edge or a larger component.



6 Path Counting Problems 

In this section we show that for series-parallel graphs the classical problem to count
the number of paths can be solved in $\mathcal{FL}$. For general graphs counting the number of
paths is not solvable in $\mathcal{FL}$ - unless certain hierarchies collapse - since this problem
is one of the generic complete problems for the class $\#\mathcal{L}$ [2]. Speaking more precisely,
let us define the functional problem #PATH as follows: given a graph $G$ and nodes $a, b$
estimate the number of different paths from $a$ to $b$ in $G$.

Theorem 6. Restricted to series-parallel graphs #PATH can be computed in $\mathcal{FL}$.

Proof: Consider the subgraph $G_{ab}$ of $G$ induced by $V := \text{succ}^*(a) \cap \text{pred}^*(b)$. It is
either empty (and then $\#\text{PATH}(a, b) = 0$), or it is a series-parallel graph with source $a$
and sink $b$. This follows from the fact that the predicate PATH restricted to nodes in $V$
is identical on $G$ and $G_{ab}$. Furthermore, all paths from $a$ to $b$ in $G$ occur in $G_{ab}$ as well,
thus the number of paths is identical. A simple induction shows that $\#\text{PATH}(a, b)$ can
be bounded by $2^{n+m}$. Using the reachability algorithm presented in Section 3 we can
also decide in log-space whether an arbitrary edge of $G$ belongs to $G_{ab}$.

Let $T = (V_T, E_T)$ be the decomposition tree of $G_{ab}$ and $z \le n + m$ be its size.
We interpret the tree as an arithmetic expression as follows. Every leaf represents the
integer 1. An internal node $v$ of $T$ labeled by $\sigma(v) = s$ (resp. $p$) corresponds to a
multiplication (resp. addition) of the expressions given by the sons of $v$. It is easy to see
that the value $p$ of the root of $T$ equals $\#\text{PATH}(a, b)$.
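The arithmetic interpretation of a decomposition tree can be illustrated with a short Python sketch. The nested-tuple encoding of the tree and the function name `count_paths` are assumptions made for this example only; the recursion is a plain evaluation and does not reflect the log-space evaluation discussed in the proof.

```python
def count_paths(tree):
    """Evaluate a decomposition tree: leaves count 1, 's' multiplies, 'p' adds.

    A tree is either ("edge", (u, v)) or (label, left, right) with label "s" or "p".
    """
    if tree[0] == "edge":
        return 1
    left, right = count_paths(tree[1]), count_paths(tree[2])
    return left * right if tree[0] == "s" else left + right


# Hypothetical example: the diamond s -> a -> t, s -> b -> t and three diamonds in series.
upper = ("s", ("edge", ("s", "a")), ("edge", ("a", "t")))
lower = ("s", ("edge", ("s", "b")), ("edge", ("b", "t")))
diamond = ("p", upper, lower)
chain = ("s", diamond, ("s", diamond, diamond))

print(count_paths(diamond))  # two paths through one diamond
print(count_paths(chain))    # 2 * 2 * 2 paths through three diamonds in series
```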

Below we sketch how $p$ can be computed in logarithmic space. Let $p_1 < p_2 < \cdots$
be the standard enumeration of primes. The prime number theorem implies

$$\prod_{p_i \le n+m} p_i = e^{(n+m)(1+o(1))} > \#\text{PATH}(a, b) . \qquad (1)$$

Using the log-space algorithm of [7] one can transform $T$ into a binary tree $T'$ of
depth $O(\log z)$ representing an arithmetic expression with the same value $p$ as $T$.

We evaluate $T'$ mod $p_i$ using the algorithm in [4]. For $p_i \le n + m$ this algorithm
works in space $O(\log z + \log(n + m)) \le O(\log n)$. By inequality (1), taking all $p_i \le
n + m$ the values $p \bmod p_i$ give a Chinese Remainder Representation of $p$. Using the
recent result of Chiu, Davida, and Litow [8] that such a representation can be converted
to the ordinary binary representation in log-space, finishes the proof. ∎

Using the hardness result shown in Section 2 it follows that the problem to compute
#PATH mod 2 is $\mathcal{L}$-complete. Using the techniques presented so far one can also solve
some other counting problems, like determining the size of $G_{ab}$ in $\mathcal{FL}$.
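The Chinese remainder step used in the proof of Theorem 6 can be sketched in Python as follows. The tree encoding, the function names and the explicit `bound` parameter (standing in for $n + m$) are assumptions for illustration. The sketch uses ordinary recursion and big-integer arithmetic, so it shows only the number-theoretic idea and not the log-space algorithms of [4], [7] and [8]. It relies on `pow(M, -1, p)`, which requires Python 3.8 or later.

```python
def primes_up_to(bound):
    """All primes p <= bound, found by trial division."""
    return [p for p in range(2, bound + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def eval_mod(tree, p):
    """Value of the decomposition tree modulo p (leaves 1, 's' multiplies, 'p' adds)."""
    if tree[0] == "edge":
        return 1 % p
    left, right = eval_mod(tree[1], p), eval_mod(tree[2], p)
    return (left * right) % p if tree[0] == "s" else (left + right) % p


def from_residues(residues, moduli):
    """Recover x with 0 <= x < product of the pairwise coprime moduli."""
    x, M = 0, 1
    for r, p in zip(residues, moduli):
        t = ((r - x) * pow(M, -1, p)) % p   # choose t so that x + M * t = r (mod p)
        x, M = x + M * t, M * p
    return x


def count_paths_crt(tree, bound):
    moduli = primes_up_to(bound)
    return from_residues([eval_mod(tree, p) for p in moduli], moduli)


# Hypothetical example: three diamonds in series have 2 * 2 * 2 paths.
edge = ("edge", None)
diamond = ("p", ("s", edge, edge), ("s", edge, edge))
chain = ("s", diamond, ("s", diamond, diamond))
print(count_paths_crt(chain, 10))  # primes 2, 3, 5, 7 have product 210, enough to recover 8
```

The result is correct only when the product of the chosen primes exceeds the true path count, which is what inequality (1) guarantees in the proof.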

7 Generalisations of Series-Parallel Graphs

First we will consider graphs with several sources, but still with a unique sink. 

Definition 3. The family of multiple source series-parallel graphs, MSSP-graphs for
short, is an extension of series-parallel graphs, adding the following constructor:

(C) In-Tree composition: a graph $G = (V, E)$ is generated from MSSP-graphs $G_i =
(V_i, E_i)$ for $i = 1, 2$ by selecting a node $v$ in $G_1$, identifying it with the sink $v_{out,2}$ of
$G_2$ and forming the union of both graphs: $V := V_1 \cup V_2$ and $E := E_1 \cup E_2$.

An in-tree composition may be applied several times, but only at the end. As soon as
a graph has several sources the series and the parallel constructor can no longer be used.
Multiple sink series-parallel graphs with a unique source can be defined in the dual way.
In the following, we will restrict ourselves to the first extension - the main results hold
for both classes. Unlike ordinary series-parallel graphs, the reachability problem for
MSSP-graphs cannot be solved by following leftmost paths: $\text{PATH}(x, y)$ no longer
implies $\text{lm-down}(x) \cap \text{lm-up}(y) \neq \emptyset$. To solve this problem we have to use a more
sophisticated strategy. Define

$$\text{Elude}(v) := \{ u \mid \exists w_1, w_2 \in \text{succ}(u) : v \in \text{lm-down}(w_1) \setminus \text{lm-down}(w_2) \}$$

and $\text{minElude}(v)$ as the closest predecessor $u$ of $v$ contained in $\text{Elude}(v)$. It can be
shown that such a unique node always exists and that it lies on $\text{lm-up}(v)$. If $\text{Elude}(v) =
\emptyset$ we set $\text{minElude}(v) := v$.




Fig. 7. a) Left-Most Paths that Do not Meet, b) Computing $\text{Elude}(v)$



Let $\text{minElude}(v, 0) := v$ and $\text{minElude}(v, i) := \text{minElude}(\text{minElude}(v, i-1))$ for
$i > 0$. Finally, define

$$\begin{aligned}
\text{minElude}^*(v) &:= \bigcup_{i \in \mathbb{N}} \text{minElude}(v, i) \\
\text{Elude-pred}^*(v) &:= \{ u \mid \exists w \in \text{minElude}^*(v) : w \in \text{lm-down}(u) \} .
\end{aligned}$$

Lemma 4. If $\text{Elude}(v) \neq \emptyset$ then there exists a unique $u \in \text{Elude}(v)$ fulfilling the
conditions in the definition of minElude. It can be computed in deterministic log-space.
Furthermore, for every MSSP-graph $G$ holds: $\text{pred}^*(v) = \text{Elude-pred}^*(v)$ for all $v$.
For an arbitrary graph and node $v$, the equality $\text{pred}^*(v) = \text{Elude-pred}^*(v)$ can be
checked in $\mathcal{L}$.

To verify whether a given graph is an MSSP-graph we can make use of the forbidden subgraph
$H$ again. However, a set of nodes fulfilling its PATH conditions can occur in an
MSSP-graph $G$, but only if $z_3$ belongs to another component $G_2$ of $G$ than $z_1$ and $z_2$,
which is connected to the rest graph of $G$ via an in-tree composition step at $z_4$. Since
this can also be verified in log-space we obtain:

Theorem 7. Recognition and reachability for MSSP-graphs is in $\mathcal{L}$.

The notion of decomposition trees can be extended to this graph family: first generate
nodes that represent the in-tree composition steps, then the subtrees that describe
the decomposition of the basic series-parallel graphs.

Theorem 8. A decomposition tree of an MSSP-graph can be computed in $\mathcal{FL}$.

The counting algorithm for SP-graphs can be extended to this class as well.

Theorem 9. For MSSP-graphs the function #PATH can be computed in $\mathcal{FL}$. The same
holds for the size of subgraphs of the form $G_{ab}$.

Finally, let us remark another generalisation of series-parallel graphs: the minimal
vertex-series-parallel graphs, MVSP for short. It has been shown that the line graphs of
MVSP-graphs are closely related to series-parallel graphs (see [15] for the definition of
MVSPs, line graphs and further details). Using a slight modification of our algorithms
and the line graph of an MVSP-graph we can extend all results shown in this paper to
MVSP-graphs.



8 Conclusions and Open Problems 

A deterministic Turing machine working in space $S \ge \log n$ can be simulated by an
EREW PRAM in time $O(S)$ (see e.g. [14]). The machine may use an exponential number
of processors with respect to $S$. Therefore, we get immediately that the graph problems
investigated in this paper can be solved in logarithmic parallel time. The simulation
of space-bounded Turing machines by PRAMs can even be performed by the EROW
model (exclusive-read owner-write). Hence we can deduce

Corollary 2. For series-parallel graphs and their extensions considered above, recog- 
nition, reachability, decomposition, and path counting can be done in logarithmic time 
on EROW PRAMs with a polynomial number of processors. 

The exact number of processors depends on the time complexity of the Turing machine.
Since our basic log-space algorithms require time $O(n^c)$ for some constant $c$
significantly larger than 1, we probably will not achieve a linear number of processors
this way. For the reachability problem in series-parallel graphs it is known that an
EREW PRAM can solve it in logarithmic time using $n / \log n$ processors [15]. But it is
still open whether also recognition and decomposition can be done in logarithmic time
using at most a linear number of processors.

If we switch to undirected graphs the problems considered here seem to be inher- 
ently more difficult. In the undirected case series-parallel graphs can be characterized 
as the set of graphs containing no clique of size 4 as a minor [10]. 

In contrast to the series-parallel graph family, the reachability problem for arbitrary 
graphs seems to be easier in the undirected case than in the directed case. From [1] we 
know that the undirected version can be solved by a randomized log-space bounded 
machine, whereas no randomized algorithm is known for the directed case. Are there 
other distinctions of this kind?

Abstract. In 1974 R. Fagin proved that properties of structures which are in NP
are exactly the same as those expressible by existential second order sentences,
that is sentences of the form: there exist $P$ such that $\varphi$, where $P$ is a tuple of
relation symbols and $\varphi$ is a first order formula. Fagin was also the first to study
monadic NP: the class of properties expressible by existential second order sentences
where all the quantified relations are unary.

In [AFS00] Ajtai, Fagin and Stockmeyer introduce closed monadic NP: the class
of properties which can be expressed by a kind of monadic second order existential
formula, where the second order quantifiers can interleave with first order
quantifiers. In order to prove that such alternation of quantifiers gives substantial
additional expressive power they construct graph properties $P_1$ and $P_2$: $P_1$ is
expressible by a sentence with the quantifier prefix in the class $(\exists\forall)^* \boldsymbol{\exists}^* (\exists\forall)^*$ ** but
not by a Boolean combination of sentences from monadic NP (i.e. with the prefix
of the form $\boldsymbol{\exists}^* (\exists\forall)^*$) and $P_2$ is expressible by a sentence $\boldsymbol{\exists}^* (\exists\forall)^* \boldsymbol{\exists}^* (\exists\forall)^*$ but
not by a Boolean combination of sentences of the form $(\exists\forall)^* \boldsymbol{\exists}^* (\exists\forall)^*$. A natural
question arises here whether the hierarchy inside closed monadic NP, defined by
the number of blocks of second order existential quantifiers, is strict.

In this paper we present a technology for proving some non expressibility results 
for monadic second order logic. As a corollary we get a new, easy, proof of the 
two results from [AFS00] mentioned above. With our technology we can also
make a first small step towards an answer to the hierarchy question by showing 
that the hierarchy inside closed monadic NP does not collapse on a first order 
level. The monadic complexity of properties definable in Kozen’s mu-calculus is 
also considered as our technology also applies to the mu-calculus itself. 



1 Introduction 

1.1 Previous Works 

In 1974 R. Fagin proved that the properties of structures which are in NP are exactly
the same as those expressible by existential second order sentences, known also as
$\Sigma^1_1$ sentences, i.e. sentences of the form: there exist relations $P$ such that $\varphi$, where $P$ is a
tuple of relation symbols (possibly of high arity) and $\varphi$ is a first order formula.

* This paper has been written while the author was visiting Laboratoire Bordelais de Recherche
en Informatique, in Bordeaux, France. I was also supported by Polish KBN grant 2 P03A 018
18.

** In this paper we use the symbols $\exists, \forall$ for the first order quantifiers and $\boldsymbol{\exists}, \boldsymbol{\forall}$ for the monadic
second order quantifiers.

Fagin was also the first to study monadic NP: the class of properties expressible by
existential second order sentences where all quantified relations are unary. The reason
for studying this class was the belief that it could serve as a training ground for attacking
the "real problems" like whether NP equals co-NP. It is not hard to show ([F75]) that
monadic NP is different from monadic co-NP. A much stronger result has even been
proved by Matz and Thomas ([MT97]). They show that the monadic hierarchy, the
natural monadic counterpart of the polynomial hierarchy, is strict (a property is in the
$k$-th level of the monadic hierarchy if it is expressible by a sentence of monadic second
order logic where all the second order quantifiers are at the beginning and there are at
most $k - 1$ alternations between second order existential and second order universal
quantifiers).

An important part of research in the area of monadic NP is devoted to the possibility
of expressing different variations of graph connectivity. Already Fagin's proof
that monadic NP is different from monadic co-NP is based on the fact that connectivity
of undirected graphs is not expressible by a sentence in monadic NP while
non-connectivity obviously is. Then de Rougemont [dR87] and Schwentick [S95] proved
that connectivity is not in monadic NP even in the presence of various built-in relations.

However, as observed by Kanellakis, the property of reachability (for undirected
graphs) is in monadic NP (reachability is the problem if, for a given graph and two
distinguished nodes $s$ and $t$, there is a path from $s$ to $t$ in this graph). It follows that
connectivity, although not in monadic NP, is expressible by a formula of the form $\forall x \forall y\, \boldsymbol{\exists} P\, \varphi$.
This observation leads to the study of closed monadic NP, the class of properties
expressible by a sentence with quantifier prefix of the form $(\boldsymbol{\exists}^* (\exists\forall)^*)^*$, and of the closed
monadic hierarchy, the class of properties expressible by a sentence with quantifier
prefix of the form $((\boldsymbol{\exists}^* (\exists\forall)^*)^* (\boldsymbol{\forall}^* (\exists\forall)^*)^*)^*$.

In [AFS00] and [AFS98] Ajtai, Fagin and Stockmeyer argue that closed monadic
NP is even a more interesting object of study than monadic NP: it is still a subclass of
NP (and also the $k$-th level of closed monadic hierarchy is still a subclass of the $k$-th
level of polynomial hierarchy), it is defined by a simple syntax and it is closed under
first order quantifications. In order to prove that such alternation of quantifiers gives
substantial additional expressive power they construct graph properties $P_1$ and $P_2$ such
that $P_1$ is expressible by a sentence with the quantifier prefix in the class $(\exists\forall)^* \boldsymbol{\exists}^* (\exists\forall)^*$,
but not by a Boolean combination of sentences from monadic NP (i.e. with the prefix
of the form $\boldsymbol{\exists}^* (\exists\forall)^*$) and $P_2$ is expressible by a sentence $\boldsymbol{\exists}^* (\exists\forall)^* \boldsymbol{\exists}^* (\exists\forall)^*$ but not by
a Boolean combination of sentences of the form $(\exists\forall)^* \boldsymbol{\exists}^* (\exists\forall)^*$. The non expressibility
results for $P_1$ and $P_2$ in [AFS00] are by no means easy and constitute the main technical
contribution of this long paper. As the authors write: "Our most difficult result is the fact
that there is an undirected graph property that is in closed monadic NP but not in
the first order/Boolean closure of monadic NP. In the game corresponding to the first
order/Boolean closure of monadic NP, played over graphs $G_0$ and $G_1$, the spoiler not
only gets to choose which of $G_0$ and $G_1$ he wishes to color, but he does not have to
make his selection until after a number of pebbling moves had been played. Thus, not
only are we faced with the situation where the spoiler gets to choose which structure
to color, but apparently also for the first time, we are being forced to consider a game
where there are pebbling rounds both before and after the coloring round."

There are many natural open questions in the area, most of them stated in [AFS00]:
is the hierarchy inside closed monadic NP strict? We mean here the hierarchy defined
by the number of blocks of second order existential quantifiers, alternating with first
order quantifiers. Is there any property in the monadic hierarchy (or, equivalently, in the
closed monadic hierarchy) which is not in closed monadic NP? Is the closed monadic
hierarchy strict? These questions seem to be quite hard: so far we do not know any
property in the (closed) monadic hierarchy which would not be expressible by a sentence
with quantifier prefix $\boldsymbol{\exists}^* (\forall\exists)^* \boldsymbol{\exists}^* (\forall\exists)^*$.

1.2 Our Contribution 

In this paper we present an inductive and compositional technology for proving some
non expressibility results for monadic second order logic. In particular, our technology
gives an alternative simple solution to all the technical problems described in the citation
from [AFS00] above. But unlike the construction in [AFS00], which is specific for
first order/Boolean closure of monadic NP, our technology is universal: it deals with
first order/Boolean closure of most monadic classes.

To be more precise, we show how to construct, for any given property $S$ not expressible
by a sentence with quantifier prefix in some non trivial^ class $W$, two properties
bool($S$) and reach($S$) which are not much harder than $S$ and such that (1) property
bool($S$) cannot be expressed by a Boolean combination of sentences with quantifier prefix
in $W$ and (2) property reach($S$) cannot be expressed by a sentence with quantifier
prefix $vw$ where $v \in (\exists + \forall)^*$ is a block of first order quantifiers and $w \in W$. Saying
that bool($S$) and reach($S$) are not much harder than $S$ we mean that if $S$ is expressible
by a sentence with quantifier prefix in some class $V$ then bool($S$) is expressible by a
sentence with the prefix of the form $\exists\exists v$ where $v \in V$ and reach($S$) is expressible by
a sentence with the prefix of the form $\boldsymbol{\exists}\forall v$ where $v \in V$. The non expressibility proof
for reach generalizes the second author's proof of the fact that directed reachability is
not expressible by a sentence with the prefix of the form $(\forall\exists)^* \boldsymbol{\exists}^* (\forall\exists)^*$ [M99].

Our lower bounds are proved in the language of Ehrenfeucht-Fraisse games. To
show that, for example, reach($S$) cannot be expressed by a sentence with a prefix
of the form $\forall\exists w$ where $w \in W$ we assume as (inductive) hypothesis that there are
two structures $P \in S$ and $R \notin S$ such that Duplicator has a winning strategy in the
game (corresponding to the prefix $w$) on $(P, R)$. Then we show how to apply some
graph composition methods to get, from $P$ and $R$, new structures $P_1 \in$ reach($S$) and
$R_1 \notin$ reach($S$) such that Duplicator has a winning strategy in the game (corresponding
to the new prefix $\forall\exists w$) on $(P_1, R_1)$. But since we know nothing about $P$ and $R$ our
knowledge about $P_1$ and $R_1$ is quite limited, so the strategy for Duplicator uses as a
black box the unknown Duplicator's strategy in a game on $(P, R)$.

With our technology we can make the first small step towards answering the hierarchy
questions. To be more precise, we show that the hierarchy inside closed monadic NP does
not collapse on any first order level. Since we do not need to care whether $w$ (the prefix
which does not express $S$) contains universal second order quantifiers or not, a variety
of results of this kind about the structure of the closed monadic hierarchy can also be
proved with our technology.

¹ See Definition 4 below.

A new, very easy, proof of the results from [AFS00] is just a corollary of our method.

It also appears that - with minor modifications - the above inductive constructions 
can also be applied inside Kozen’s mu-calculus [Ko83]. This constitutes a first small 
step towards trying to understand, over finite models, the (descriptive) complexity (in 
terms of patterns of FO and/or monadic quantifiers’ prefix) of properties definable in 
the mu-calculus. 



2 Technical Part 

2.1 Structures 

All the structures we consider in this paper are finite graphs (directed or not). The 
signature of the structures may also contain some additional unary relations (“colors”) 
and constants (s and t). 

2.2 Games 

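Definition 1. 1. A pattern of a monadic game (or just pattern) is any word over the
alphabet $\{\forall, \exists, \boldsymbol{\forall}, \boldsymbol{\exists}, \oplus\}$.

2. If $w$ is a pattern then the pattern $\bar{w}$ (dual to $w$) is inductively defined as
$\forall\bar{v}$, $\exists\bar{v}$, $\boldsymbol{\forall}\bar{v}$, $\boldsymbol{\exists}\bar{v}$ or $\oplus\bar{v}$ if $w$ equals
$\exists v$, $\forall v$, $\boldsymbol{\exists} v$, $\boldsymbol{\forall} v$ or $\oplus v$ respectively. The dual of the empty
word is the empty word.

$\forall$ and $\exists$ still keep the meaning of universal and existential first order quantifiers,
while $\boldsymbol{\forall}$ and $\boldsymbol{\exists}$ (in bold) are universal and existential monadic second
order (set) quantifiers. As you will soon see, $\oplus$ should be understood as a sort of boolean
closure of a game. We will use the abbreviation $FO$ for the regular expression $(\forall + \exists)$.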
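
As a small illustration of the duality in Definition 1, the dual of a pattern can be computed letter by letter. The sketch below is not part of the paper; it assumes an ad hoc ASCII encoding chosen only for this example: `a`/`e` for the first order quantifiers (universal/existential), `A`/`E` for the monadic second order ones, and `+` for the boolean-closure letter.

```python
# Hypothetical ASCII encoding of pattern letters (an assumption of this sketch):
#   'a' / 'e' : first order universal / existential quantifier
#   'A' / 'E' : monadic second order universal / existential quantifier
#   '+'       : the boolean-closure letter
_DUAL_LETTER = {"a": "e", "e": "a", "A": "E", "E": "A", "+": "+"}


def dual(pattern: str) -> str:
    """Return the dual pattern: every quantifier is swapped with its
    opposite of the same order, and '+' is kept unchanged."""
    return "".join(_DUAL_LETTER[letter] for letter in pattern)
```

Under this encoding, `dual("+Eae")` gives `"+Aea"`, and applying `dual` twice returns the original pattern.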

Definition 2. Let $P$ and $R$ be two relational structures over the same signature. Let $w$
be some pattern. An Ehrenfeucht-Fraïssé game with pattern $w$ over $(P, R)$ is then the
following game between 2 players, called Spoiler and Duplicator:

1. If $w$ is the empty word then the game is over and Duplicator wins if the substructures
induced in $P$ and in $R$ by all the constants in the signature are isomorphic.
Spoiler wins if they are not isomorphic.

2. If $w$ is nonempty then:

(a) If $w = \exists v$ ($w = \forall v$) for some $v$ then a new constant symbol $c$ is added to
the signature, Spoiler chooses the interpretation of $c$ in $P$ ($R$ resp.) and then
Duplicator chooses the interpretation of $c$ in $R$ ($P$ resp.). Then they play the
game with pattern $v$ on the enriched structures.

(b) If $w = \boldsymbol{\exists} v$ ($w = \boldsymbol{\forall} v$) for some $v$ then a new unary relation symbol $C$
is added to the signature, Spoiler chooses the interpretation of $C$ in $P$ ($R$ resp.) and
then Duplicator chooses the interpretation of $C$ in $R$ ($P$ resp.). Then they play
the game with pattern $v$ on the enriched structures.

(c) If $w = \oplus v$ for some $v$ then Spoiler can decide if he prefers to continue with the
game with pattern $v$ or rather with $\bar{v}$. Then they play the game with the pattern
chosen by Spoiler.

The part of the game described by item (a) is called a first order round, or pebbling 
round. The part described by item (b) is a second order round, or coloring round. 

Definition 3. We say that a property (i.e. a class of structures) $S$ is expressible by a
pattern $w$ if for each two structures $P \in S$ and $R \notin S$ Spoiler has a winning strategy
in the game with pattern $w$ on $(P, R)$. If $W$ is a set of patterns then we say that $S$ is
expressible in $W$ if there exists a $w \in W$ such that $S$ is expressible by $w$.

The following theorem illustrates the links between games and logics. We skip its
proof as it is well known (see for example [EF] and [AFS00]):

Theorem 1. 1. Monadic NP is exactly the class of properties expressible by
$\boldsymbol{\exists}^* FO^*$;

2. The boolean closure of monadic NP is exactly the class of properties expressible by
$\oplus\boldsymbol{\exists}^* FO^*$;

3. The first order closure of monadic NP is exactly the class of properties expressible
by $FO^* \oplus \boldsymbol{\exists}^* FO^*$;

4. The 2k-th level of the monadic hierarchy is exactly the class of properties expressible by
$(\boldsymbol{\exists}^* \boldsymbol{\forall}^*)^k FO^*$;

5. The 2k-th level of the closed monadic hierarchy is exactly the class of properties
expressible by $(FO^* \boldsymbol{\exists}^* \boldsymbol{\forall}^*)^k FO^*$;

6. Closed monadic NP is exactly the class of properties expressible by
$(FO^* \boldsymbol{\exists}^*)^*$.

The last theorem motivates:

Definition 4. A non trivial class of game patterns (or just class) is a set of game
patterns denoted by a regular expression without union over the alphabet
$\{\oplus, \boldsymbol{\exists}, \boldsymbol{\forall}, FO\}$, which ends with $FO^*$ and contains at least one
$\boldsymbol{\forall}^*$ or $\boldsymbol{\exists}^*$.

In the sequel, all classes of game patterns we consider are non trivial. 

2.3 Graph Operations 

The techniques we are going to present are inductive and compositional. Inductive
means here that we will assume as a hypothesis that there is a property expressible
by some class of patterns $W_1$ but not by $W$ and then, under this hypothesis, we will
prove that there is a property expressible in the class $V_1W_1$ but not in the class $VW$,
where $V_1$ and $V$ will be some (short) prefixes. The word compositional means here
that the pair of structures $(P_{VW}, R_{VW})$ (on which Duplicator has a winning strategy
in a $VW$ game) will be directly constructed from the pair of structures $(P_W, R_W)$ (on
which Duplicator has a winning strategy in a $W$ game). For this construction we do not
need to know anything about the original structures.

In the sequel, we will assume that all our structures are connected and that the
signature contains a constant $s$ (for source). This is possible thanks to the following
natural definition and obvious lemma:

Definition 5. Let $S$ be a property of structures (with a signature without the constant
$s$). Then $cone(S)$ is the property of structures (with the same signature, enriched with
the constant $s$): for every $x$ distinct from $s$ there is an edge from $s$ to $x$, and the
substructure induced by all the vertices distinct from $s$ has the property $S$.

Lemma 1. If $S$ is expressible by $w$ then $cone(S)$ also is. If $S$ is not expressible by $w$
then there is a pair of connected structures $(P, R)$ (see Definition 6 below) such that $P$
has the property $cone(S)$, $R$ does not, and Duplicator has a winning strategy in the $w$
game on $(P, R)$. ■

Now we introduce some notations for graph operations. As we just mentioned we 
assume that all the graphs we are dealing with are connected and have some distin- 
guished node s. Some of them will also have another distinguished node t (for target). 




Fig. 1. Some Graph Operations (one panel shows a connected pair of graphs P1 and P2).



Definition 6. 1. Let $U$ denote the graph containing just two vertices, $s$ and $t$, and one
edge $E(s, t)$.

2. If A is a set of graphs, then Sp^^P ( Sp^^P) is the union of all graphs in A with 
all the s vertices identified (resp. and all the t vertices identified). We will use also 
the notation S^P (EpP) if A contains just c copies of the same structure P. If 
there are only two elements, say P and R in A, then we write P+R (or P+\-R) 
instead ofofSp^^P (or Ep^^P). 

3. If $P$ is a graph with constants $s$ and $t$ then $P.R$ (or $PR$ for short) is the graph
being a union of $P$ and $R$ with $t$ of $P$ identified with $s$ of $R$ (so that $s$ of the new
graph is the $s$ of $P$ and the $t$ of the new graph is the $t$ of $R$, if it exists).

4. If A is a set of graphs then the graph Ep^^(U P) will be called a connected set of 
graphs. If there are just two elements in A then we will call it a connected pair of 
graphs. 
2.4 Some Simple Lemmas about Games

Let us start with an obvious lemma, which would remain true even without the
assumption that the relations introduced during the second order rounds are unary:

Lemma 2. If the graphs $P$ and $R$ are isomorphic then Duplicator has a winning
strategy in the $w$ game on $(P, R)$ whatever $w$ is. ■

The following Lemmas 3-5 are not much harder to prove than Lemma 2 but the
assumption that games are monadic is crucial here:

Lemma 3. If Duplicator has winning strategies in $w$ games on $(P_1, R_1)$ and on
$(P_2, R_2)$ then he also has winning strategies in $w$ games on $(P_1 + P_2, R_1 + R_2)$, on
$(P_1P_2, R_1R_2)$ and on $(P_1 {+}{+} P_2, R_1 {+}{+} R_2)$. ■

Lemma 4. For every structure P and pattern w there exists a number n such that pro- 
vided m > n then Duplicator has winning strategies in the w games on 
and{E‘flP,P^fi+,P) 

Proof Induction on the structure of w. Use the fact that for a structure P of some fixed 
size there are only finitely many colorings of it, so if we have enough copies some 
colorings must repeat many times. ■ 

Lemma 5. Let $P$ be a connected pair of structures $P_1$ and $P_2$ and let $R$ be a connected
pair of structures $R_1$ and $R_2$. Suppose for some (non trivial¹) class $V$ there exists
$v \in V$ such that Spoiler has a winning strategy in the $v$ games on $(P_1, R_1)$ and on
$(P_1, R_2)$. Then there exists $w \in \exists V$ such that Spoiler has a winning strategy in the
$w$ game on $(P, R)$.

Proof. The strategy of Spoiler is to take as his first constant the source of $P_1$ in $P$.
Duplicator must answer either with the source of $R_1$ or of $R_2$, and so he must make a
commitment on which of the two structures is going to play the role of $R_1$ in $R$ now. The
cases are symmetric, so let us assume he decides on $R_1$. Then Spoiler uses his strategy
for the $v$ game on $(P_1, R_1)$ to win the game. Actually, Spoiler must force Duplicator to
move only inside the structures $P_1$ and $R_1$. This can be achieved with one more coloring
round (at any time in the $v$ game), subsequently playing a $w$-game for some $w \in V$,
since $V$ is non trivial. The next remark makes this observation more precise. ■

Remark 6. After the first round, when Spoiler picks the source of $P_1$ and Duplicator
answers by the source of $R_1$, Spoiler must force Duplicator to restrict the moves of the
remaining game only to the structures $P_1$ and $R_1$. In other words, Spoiler needs to be
sure that each time he picks a constant inside $P_1$ ($R_1$) Duplicator actually answers with
a constant inside $R_1$ ($P_1$). This can be secured with the use of an additional coloring
round: Spoiler paints $P_1$ (or $R_1$, he is as happy with a $\boldsymbol{\exists}$ round as with a
$\boldsymbol{\forall}$ one) with some color, leaving the rest of $P$ unpainted. Duplicator must answer by
painting $R_1$ ($P_1$) with this color, leaving the rest of $R$ unpainted. Otherwise, this will be
detected by Spoiler with the use of the final first order rounds. Notice that the additional
coloring round can take place at any moment of the game, and so the strategy is available
for Spoiler for some $\exists V$ game since $V$ is a non trivial class of patterns.

¹ See Definition 4.
2.5 A Tool for the Boolean Closure

Let $S$ be any property. Then a connected pair of structures $UP + UR$ will be called $SS$
if both the structures $P$ and $R$ belong to $S$, $S\bar{S}$ if exactly one of them belongs to $S$
and $\bar{S}\bar{S}$ otherwise.

Definition 7. For a property $S$ define $bool(S)$ as the property: the structure is a
connected set of connected pairs of structures, and at least one of those pairs is $S\bar{S}$.



Lemma 7. Suppose a property $S$ is not expressible in class $W$, but both $S$ and its
complement $\bar{S}$ are expressible in some other class $V$. Then $bool(S)$ is not expressible
in $\oplus W$ but is expressible in $\exists\exists V$.

Proof. Let us first show that there exists $w \in V$ such that, provided $P \in bool(S)$ and
$R \notin bool(S)$, Spoiler has a winning strategy in the $\exists\exists w$ game on $(P, R)$. This will
prove that property $bool(S)$ is expressible by $\exists\exists V$.

First observe that if $R$ is not a connected set of pairs then either the vertices of $R$ at
distance less than 2 from $s$ do not form a tree, or there is a vertex at distance 2 from $s$
whose degree is not 3, or $R$ is not connected, or there is a vertex $x$ at distance 2 from
$s$ such that the structure resulting from removing $x$ (and all the three adjacent edges)
from $R$ has less than 3 connected components. In each of those cases Spoiler can win
some game in $\exists V$ for every non trivial $V$.

If $R$ is a connected set of pairs then in his first move Spoiler takes as his constant
the source of some $S\bar{S}$ pair in $P$. Duplicator must answer by showing a source of some
pair in $R$. There are two cases: either Duplicator shows a source of some $SS$ pair in $R$
or a source of some $\bar{S}\bar{S}$ pair in $R$. In each of the two cases we may think that one pair
of structures has been selected in $P$ and one in $R$. Spoiler can restrict the game to the
two selected pairs (see Remark 6). Then we use Lemma 5 to finish the proof.

Now we will show that whatever a pattern $\oplus w$ is, where $w \in W$, there exist two
structures $P \in bool(S)$ and $R \notin bool(S)$ such that Duplicator has a winning strategy in
the $\oplus w$ game on $(P, R)$. Let $(P_1, R_1)$ be such a pair of structures that $P_1 \in S$,
$R_1 \notin S$ and Duplicator has a winning strategy in the $w$ game on $(P_1, R_1)$. Let $c$ be
some huge constant. Let $R = (\Sigma^{c}\, U(UP_1 + UP_1)) + (\Sigma^{c}\, U(UR_1 + UR_1))$. So $R$ is a
connected set of $2c$ connected pairs, $c$ of them are $SS$ and $c$ are $\bar{S}\bar{S}$. Obviously,
$R \notin bool(S)$. Let $P = R + U(UP_1 + UR_1)$ be $R$ with one more pair, an $S\bar{S}$ one, so that
$P \in bool(S)$.

Now, if Spoiler in his first move decides to play the game $w$ on $P$ and $R$ then remark
that $P$ is $Q_1 + Q_2 + Q_3$ where $Q_1 = \Sigma^{c}\, U(UP_1 + UP_1)$, $Q_2 = \Sigma^{c}\, U(UR_1 + UR_1)$
and $Q_3 = U(UR_1 + UP_1)$, while $R$ is $Q_4 + Q_5 + Q_6$ where $Q_4 = \Sigma^{c}\, U(UP_1 + UP_1)$,
$Q_5 = \Sigma^{c-1}\, U(UR_1 + UR_1)$ and $Q_6 = U(UR_1 + UR_1)$. We know that Duplicator
has winning strategies in $w$ games on $(Q_1, Q_4)$ (by Lemma 2), on $(Q_2, Q_5)$ (by
Lemma 4) and on $(Q_3, Q_6)$ (by Lemma 3, since he has a winning strategy in a $w$ game
on $(P_1, R_1)$). So, again by Lemma 3, he has a winning strategy in the $w$ game on $(P, R)$.

If Spoiler decides in his first round to continue with $\bar{w}$ rather than $w$ then take
$Q_1, Q_2, Q_3$ as before but $Q_4 = \Sigma^{c-1}\, U(UP_1 + UP_1)$, $Q_5 = \Sigma^{c}\, U(UR_1 + UR_1)$,
$Q_6 = U(UP_1 + UP_1)$ and use the same reasoning, using the fact that Duplicator has a
winning strategy in the $\bar{w}$ game on $(R_1, P_1)$. ■







2.6 A Tool for First Order Quantifiers 

Now the signature of our structures will contain an additional unary relation symbol $G$
(for gate). For a given structure $P$, and for two of its vertices $x, y$ such that $G(y)$ holds,
let $P_{x,y}$ be the structure consisting of the connected component of $P - \{x\}$ containing
$y$, with $y$ as its source. $P - \{x\}$ is here understood to be the structure resulting from $P$
after removing $x$ and all its adjacent edges. So $P_{x,y}$ could be read as "the structure you
enter from $x$ crossing the gate $y$" (see Figure 2).




Fig. 2. Px,y Is the Structure You Enter from x Crossing the Gate y. 



Definition 8. Let $S$ be some property of structures. Then $reach(S)$ will be the following
property (of a structure $P$): there is a path from $s$ to $t$ such that for every $x$ on this path
it holds that (i) $x \notin G$ and (ii) for every $y$ such that $E(x, y)$ and $G(y)$ the structure
$P_{x,y}$ has the property $S$.

By a path from $s$ to $t$ we mean a subset $H$ of the set of vertices of the structure such
that $s, t \in H$, each of $s$ and $t$ has exactly one adjacent vertex in $H$ and each element
of $H$ which is neither $s$ nor $t$ has exactly 2 adjacent vertices in $H$. The fact that $H$ is a
path is expressible by $FO^*$.
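
The local, degree-based notion of a path used in Definition 8 can be checked vertex by vertex, which is why a fixed number of first order rounds suffices. The following is a minimal sketch of that check; the function name `is_path_set` and the adjacency-dictionary representation of an undirected graph are assumptions made for this illustration only.

```python
def is_path_set(adj, H, s, t):
    """Check the degree conditions from the text for a vertex subset H.

    adj maps each vertex to the set of its neighbours (undirected graph).
    H is accepted when s and t belong to H, each of s and t has exactly
    one neighbour inside H, and every other vertex of H has exactly two.
    """
    H = set(H)
    if s not in H or t not in H:
        return False
    for v in H:
        inside = len(adj[v] & H)              # neighbours of v that lie in H
        expected = 1 if v in (s, t) else 2
        if inside != expected:
            return False
    return True
```

The test mirrors the wording of the definition literally: it only inspects the number of neighbours inside H and performs no separate connectivity check.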

Lemma 8. 1. Suppose a property $S$ is not expressible in some class $W$. Then $reach(S)$
is not expressible in $FO^*W$;

2. Suppose a property $S$ is expressible in some class $W$. Then $reach(S)$ is expressible
in the class $\boldsymbol{\exists}\forall\forall W$.

Proof. 1. First of all we will show that if $S$ is not expressible in $W$, then $reach(S)$ is
also not expressible in $W$. For a given $w \in W$ there are structures $P$ and $R$ such that
$P \in S$, $R \notin S$ and Duplicator has a winning strategy in the $w$ game on $(P, R)$. Consider
a structure $T$ whose only elements are $s, t, x, y$, whose edges are $E(s, x)$, $E(x, t)$,
$E(x, y)$ and for which $G(y)$ holds. Let $P_0$ be the union of $T$ and $P$, with $y$ of $T$
identified with $s$ of $P$. The $s$ and $t$ of $P_0$ are $s$ and $t$ of $T$. Let $R_0$ be the structure
constructed in the same way from $T$ and $R$. Then obviously $P_0 \in reach(S)$,
$R_0 \notin reach(S)$ and Duplicator has a winning strategy in the $w$ game on $(P_0, R_0)$.
Notice that both $P_0$ and $R_0$ have the following property:

(*) (property of structure $Q$) if $x$ is reachable from $s$ or from $t$ by a path disjoint
from $G$ and if $y$ is such that $G(y)$ and $E(x, y)$ then $Q_{x,y}$ contains neither $s$ of $Q$ nor $t$
of $Q$.
Now let $P$ and $R$ be structures, both satisfying (*) and such that $P \in reach(S)$,
$R \notin reach(S)$ and Duplicator has a winning strategy in a $w$ game on $(P, R)$. In order
to prove our claim it is enough (by induction) to construct structures $(P_1, R_1)$, both
satisfying (*) and such that $P_1 \in reach(S)$, $R_1 \notin reach(S)$ and Duplicator has a
winning strategy in a $\forall\exists w$ game on $(P_1, R_1)$. Let $n$ be a huge enough constant. Define:
Pi = {E^{PR))+\-{E^*{RP)) and Pi = Pi-H-PP. Obviously Pi G reach{S) and 
Pi ^ reach{S) hold. Now will show a winning strategy for Duplicator in a vain game 
on (Pi , Pi). In his first round Spoiler selects some constant in Pi. Duplicator answers 
with the same constant in Pi (this is possible since Pi can be viewed as a subset of Pi). 
Now notice that after this first round Pi can be seen as 

pp^pp^(r«‘_i(pp))+G(r«‘_i(pp)) 

and Pi as 

PP4LPP4L ( 1 (PP) ) +f ( 1 (PP) ) 4LPP 

where the constant selected in the first round is in the first PP+bPP, both in Pi and in 
Pi . By Lemma 2 and Lemma 3 it is now enough to show that Duplicator has a winning 
strategy in the remaining 3w game on (Pj, P 2 ) where 

P2 = r:‘_i(pp))^(r:‘_i(pp))^pp 

and 

P2 = r:‘_i(pp))^(r:*_i(pp)) 

Let Spoiler select some constant in P 2 . 

If Spoiler selects a constant in 27 ^*_i(PP))+l-( 27 ^*_i(PP)) then Duplicator an- 
swers with the same constant in P 2 and then wins easily. The only interesting case 
is when Spoiler selects his constant in PP. Suppose it is selected in the first P (the 
other case is symmetric). Then Duplicator answers by selecting the same constant in 
the P of some PP in P2. Notice that P2 = ( 5 i-H-( 52 +l-( 27 **_^(PP)) and P2 = 
Q 3 +hQi^{E^J_i{RP)), where Qi = PP, Q 2 = E^J_i{PR)), Q 3 = PR and 
Q4 = S^_2{PR)), and where some constant is already fixed in the first P of Q\ 
and in the P of Q 3 . Now the w game remains to be played. But since Duplicator has a 
winning strategy in the w game on (P, P) he also has (by Lemmas 2 and 3) a winning 
strategy in aw game on (Qi , Q 3 ). By Lemma 4 he has a winning strategy in a ru game 
on {Q2, Q4) and so, again by Lemma 3 we get a winning strategy for Duplicator in the 
aw game on (P2, P2)- 

2. Suppose $P \in reach(S)$ and $R \notin reach(S)$. Spoiler, in his first move, fixes a path
in $P$, as in the definition of $reach(S)$. Duplicator answers by selecting a set in $R$. If the
set selected by Duplicator is not a path from $s$ to $t$ then Spoiler only needs some fixed
number of first order rounds to win. If it is such a path then there must be some $x$ on the
path, and some $y$ such that $E(x, y)$, $G(y)$ hold in $R$ and $R_{x,y} \notin S$. Now Spoiler uses
his two first order universal rounds to fix those $x$ and $y$. Duplicator answers with some
two points $z, t$ in $P$ such that $E(z, t)$ and $G(t)$ hold in $P$. But, since $P \in reach(S)$, it
turns out that $P_{z,t} \in S$, so Spoiler can use rounds of the remaining $w$ game to secure
a win (a trick from Remark 6 will be needed here to restrict the $w$ game to $P_{z,t}$, $R_{x,y}$).
Remark 9. The role of predicate $G$ is not crucial for the construction above. It could be
replaced by a graph gadget if the reader wishes to see $\mathcal{P}_2$ being a property of undirected
uncolored graphs.

Another way to avoid the unary relation G (as suggested by Larry Stockmeyer) is to 
define reach(S) as: there is a path from s to t such that for every x on this path and 
every y such that E(x, y) and y is not on this path, the structure Px,y has the property 
S. 

2.7 Corollaries 

As the first application of our toolkit we reprove the results from [AFS00]:

Theorem 2. There exists a property $\mathcal{P}_1$ expressible in $FO^* \boldsymbol{\exists}^* FO^*$ but not in
$\oplus \boldsymbol{\exists}^* FO^*$. There exists a property $\mathcal{P}_2$ expressible in
$\boldsymbol{\exists} FO^* \boldsymbol{\exists}^* FO^*$ but not in $FO^* \oplus \boldsymbol{\exists}^* FO^*$.

Proof. Let $Cted$ be the property of connectivity. It is well known that $Cted$ is not
expressible in $\boldsymbol{\exists}^* FO^*$ but both $Cted$ and its complement are expressible in
$\forall\forall \boldsymbol{\exists}^* FO^*$. Now take $\mathcal{P}_1 = bool(cone(Cted))$ and
$\mathcal{P}_2 = reach(bool(cone(Cted)))$. Use Lemmas 7 and 8 to finish the proof. ■

A new result we can prove is that even if the hierarchy inside closed monadic NP 
collapses, it does not collapse on a first order level: 

Theorem 3. If there is a property expressible in $FO^*W$ but not in $W$, where
$W = (\boldsymbol{\exists}^* FO^*)^k$, then there is a property expressible in $\boldsymbol{\exists} FO^* W$ but
not in $FO^* W$.

Proof. This follows immediately from Lemma 8. ■

Several similar results can be proved for the closed monadic hierarchy or reproved for 
the monadic hierarchy (see [MT97] and [Ma99] sections 4.4 and 4.5). 

It is interesting to remark that the inductive constructions presented here are also
definable (with minor and insignificant variations) inside Kozen's propositional μ-calculus
[Ko83].

More precisely, given some unary predicate $S$, one may define in the μ-calculus
the new predicates that depend on $S$: $Bool(S) = \Diamond(\Diamond S \wedge \Diamond\neg S)$ and
$Reach(S) = \mu X.(\Box(G \Rightarrow S) \wedge (\Diamond X \vee T))$, which denote almost the same
constructions (here the "target" constant $t$ is replaced by the set of "possible targets" $T$
and the "source" constant $s$ is the implicit free FO variable in any μ-calculus formula).

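From Lemmas 7 and 8 (which extend to these definitions inside the μ-calculus)
and the fact that (the μ-calculus version of) directed reachability
$dreach = \mu X.(\Diamond X \vee T)$ is not expressible in $\boldsymbol{\exists}^* FO^*$ while both $dreach$
and its complement are expressible in $\boldsymbol{\exists}\forall \boldsymbol{\exists}^* FO^*$, one has:

Corollary 1. There are properties $\mathcal{R}_1$ and $\mathcal{R}_2$ definable in the monadic μ-calculus
such that $\mathcal{R}_1$ is expressible in $FO^* \boldsymbol{\exists} FO^* \boldsymbol{\exists}^* FO^*$ but not in
$\oplus \boldsymbol{\exists}^* FO^*$ and $\mathcal{R}_2$ is expressible in
$\boldsymbol{\exists} FO^* \boldsymbol{\exists} FO^* \boldsymbol{\exists}^* FO^*$ but not in $FO^* \oplus \boldsymbol{\exists}^* FO^*$.

Proof. Take $\mathcal{R}_1 = Bool(dreach)$ and $\mathcal{R}_2 = Reach(dreach)$ and apply Lemmas 7
and 8 to finish the proof. ■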
Abstract. The Max-Bisection and Min-Bisection problems are to find
a partition of the vertices of a graph into two equal size subsets that
respectively maximizes or minimizes the number of edges with endpoints
in both subsets.

We design the first polynomial time approximation scheme for the
Max-Bisection problem on arbitrary planar graphs, solving a long-standing
open problem. The method of solution involves designing exact polynomial
time algorithms for computing optimal partitions of bounded
treewidth graphs, in particular Max- and Min-Bisection, which could be
of independent interest.

Using a similar method we also design the first polynomial time
approximation scheme for Max-Bisection on unit disk graphs (which could be
easily extended to other geometrically defined graphs).



1 Introduction 

The max-bisection and min-bisection problems, i.e., the problems of constructing 
a halving of the vertex set of a graph that respectively maximizes or minimizes 
the number of edges across the partition, belong to the basic combinatorial 
optimization problems. 

The best known approximation algorithm for max-bisection yields a solu- 
tion whose size is at least 0.701 times the optimum [16] whereas the best known 
approximation algorithm for min-bisection achieves “solely” a log-square approx- 
imation factor [11]. The former factor for max-bisection is considerably improved 
for regular graphs to 0.795 in [10] whereas the latter factor for min-bisection is 
improved for graphs excluding any fixed minor (e.g., planar graphs) to a logarith- 
mic one in [11]. For dense graphs, Arora, Karger and Karpinski give polynomial 
time approximation schemes for max- and min-bisection in [2] . 

In this paper, we study the max-bisection and min-bisection problems on 
bounded treewidth graphs and on planar graphs. Both graph families are known 
to admit exact polynomial time algorithms for max-cut, i.e., for finding a
bipartition that maximizes the number of edges with endpoints in both sets in the
partition [9,14].

Our first main result consists of exact polynomial time algorithms for finding a
partition of a bounded treewidth graph into two sets of a priori given cardinalities,
respectively maximizing or minimizing the number of edges with endpoints in
both sets. Thus, in particular, we obtain polynomial time algorithms for
max-bisection and min-bisection on bounded treewidth graphs.

The complexity and approximability status of max-bisection on planar graphs 
have been long-standing open problems. Contrary to the status of planar max- 
cut, planar max-bisection has been proven recently to be NP-hard in exact set- 
ting by Jerrum [17]. Karpinski et al. observed in [18] that the max-bisection 
problem for planar graphs does not fall directly into the Khanna-Motwani’s syn- 
tactic framework for planar optimization problems [19]. On the other hand, they 
provided a polynomial time approximation scheme (PTAS) for max-bisection in 
planar graphs of sublinear maximum degree. (In fact, their method implies that 
the size of max-bisection is very close to that of max-cut in planar graphs of 
sublinear maximum degree.) 

Our second main result is the first polynomial time approximation scheme
for the max-bisection problem for arbitrary planar graphs. It is obtained by
combining (via tree-typed dynamic programming) Baker's original method
of dividing the input planar graph into families of k-outerplanar graphs [4] with
our method of finding maximum partitions of bounded treewidth graphs.

Note that the NP-hardness of exact planar max-bisection makes our PTAS
result best possible under the usual assumptions.

Interestingly, our PTAS for planar max-bisection can be easily modified to a
PTAS for the problem of min-bisection on planar graphs in the very special case
where the min-bisection is relatively large, i.e., cuts $\Omega(n \log\log n/\log n)$ edges.

Unit disk graphs are another important class of graphs defined by the ge- 
ometric conditions on a plane. An undirected graph is a unit disk graph if its 
vertices can be put in one to one correspondence with disks of equal radius in 
the plane in such a way that two vertices are joined by an edge if and only if the 
corresponding disks intersect. Tangent disks are considered to intersect. 

Our third main result is the first polynomial time approximation scheme 
for the max-bisection problem on unit disk graphs. The scheme can be easily 
generalized to include other geometric intersection graphs. It is obtained by 
combining (again via tree-typed dynamic programming) the idea of Hunt et al. 
of dividing the input graph defined by plane conditions into families of subgraphs 
[15] with the aforementioned known methods of finding maximum partitions of 
dense graphs [2]. 

The structure of our paper is as follows. The next section complements the 
introduction with basic definitions and facts. In Section 3, the algorithms for 
optimal partitions of bounded treewidth graphs are given. Section 4 presents 
the PTAS for planar max-bisections. In Section 5, we make several observations 
on the approximability of planar min-bisection. Finally, Section 6 describes the 
PTAS for max-bisection on unit disk graphs. In conclusion, we note that the same
technique can also be applied to other geometric intersection graphs.

2 Preliminaries 

We start with formulating the underlying optimal graph partition problems. 

Definition 1. A partition of the set of vertices of an undirected graph $G$ into two
sets $X, Y$ is called an $(|X|, |Y|)$-partition of $G$. The edges of $G$ with one endpoint
in $X$ and the other in $Y$ are said to be cut by the partition. The size of an
$(l, k)$-partition is the number of edges which are cut by it. An $(l, k)$-partition of $G$ is
said to be a maximum $(l, k)$-partition of $G$ if it has the largest size among all
$(l, k)$-partitions of $G$. An $(l, k)$-partition of $G$ is a bisection if $l = k$. A bisection
of $G$ is a max bisection or a min bisection of $G$ if it respectively maximizes or
minimizes the number of cut edges. An $(l, k)$-partition of $G$ is a max cut of $G$ if
it has the largest size among all $(l', k')$-partitions of $G$. The max-cut problem is
to find a max cut of a graph. Analogously, the max-bisection problem is to find
a max bisection of a graph. The min-cut problem and the min-bisection problem
are defined analogously.
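
To make the notions of Definition 1 concrete, the size of a partition and an optimal $(l, n-l)$-partition can be computed by exhaustive search on very small graphs. The sketch below is only an illustration of the definitions (it takes exponential time and is not one of the algorithms of this paper); the function names and the edge-list representation are assumptions of the example.

```python
from itertools import combinations


def cut_size(edges, X):
    """Number of edges with exactly one endpoint in the vertex set X."""
    X = set(X)
    return sum(1 for u, v in edges if (u in X) != (v in X))


def best_partition_size(vertices, edges, l, maximize=True):
    """Size of a maximum (or minimum) (l, n - l)-partition, found by
    trying every subset X of exactly l vertices (brute force)."""
    pick = max if maximize else min
    return pick(cut_size(edges, X) for X in combinations(vertices, l))
```

For the cycle on four vertices with edges `[(0, 1), (1, 2), (2, 3), (3, 0)]`, calling `best_partition_size(range(4), edges, 2)` yields the max bisection size 4, and passing `maximize=False` yields the min bisection size 2.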

The notion of treewidth of a graph was originally introduced by Robertson 
and Seymour [21]. It has turned out to be equivalent to several other interesting 
graph theoretic notions, e.g., the notion of partial k-trees [1,5]. 

Definition 2. A tree-decomposition of a graph $G = (V, E)$ is a pair
$(\{X_i \mid i \in I\}, T = (I, F))$, where $\{X_i \mid i \in I\}$ is a collection of subsets of $V$,
and $T = (I, F)$ is a tree, such that the following conditions hold:

1. $\bigcup_{i \in I} X_i = V$.

2. For all edges $(v, w) \in E$, there exists a node $i \in I$ with $v, w \in X_i$.

3. For every vertex $v \in V$, the subgraph of $T$ induced by the nodes
$\{i \in I \mid v \in X_i\}$ is connected.

The treewidth of a tree-decomposition $(\{X_i \mid i \in I\}, T = (I, F))$ is
$\max_{i \in I} |X_i| - 1$. The treewidth of a graph is the minimum treewidth over all possible
tree-decompositions of the graph. A graph which has a tree-decomposition of treewidth
$O(1)$ is called a bounded treewidth graph.
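
The conditions of a tree-decomposition are easy to verify mechanically. The following sketch checks the usual covering condition (every vertex lies in some bag), the edge condition and the connectivity condition, and computes the width. It assumes, without checking, that the given `tree_edges` really form a tree on the bag indices; all names are hypothetical and chosen for this illustration.

```python
def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """bags maps each tree node i to the set X_i; tree_edges are the edges
    of the tree T on those nodes (assumed to form a tree)."""
    # Covering condition: the bags together contain exactly the vertices.
    if set().union(*bags.values()) != set(vertices):
        return False
    # Edge condition: both endpoints of every edge share some bag.
    for v, w in edges:
        if not any(v in bag and w in bag for bag in bags.values()):
            return False
    # Connectivity condition: the nodes whose bags contain v induce a
    # connected subgraph of T (checked by a search restricted to them).
    neighbours = {i: set() for i in bags}
    for i, j in tree_edges:
        neighbours[i].add(j)
        neighbours[j].add(i)
    for v in vertices:
        nodes = {i for i, bag in bags.items() if v in bag}
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            i = stack.pop()
            for j in neighbours[i] & nodes:
                if j not in seen:
                    seen.add(j)
                    stack.append(j)
        if seen != nodes:
            return False
    return True


def decomposition_width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags.values()) - 1
```

For the path on vertices 0, 1, 2, the bags `{0: {0, 1}, 1: {1, 2}}` joined by the single tree edge `(0, 1)` pass the check and have width 1.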

Fact 1[6]: For a bounded treewidth graph, a tree decomposition of minimum 
treewidth can be found in linear time. 

To state our approximation results on max-bisection we need the following 
definition. 

Definition 3. A real number $\alpha$ is said to be an approximation ratio for a
maximization problem, or equivalently the problem is said to be approximable within
a ratio $\alpha$, if there is a polynomial time algorithm for the problem which
always produces a solution of size at least $\alpha$ times the optimum. If a problem is
approximable within arbitrary $\alpha < 1$ then it is said to admit a polynomial time
approximation scheme (a PTAS for short).
An approximation ratio and a PTAS for a minimization problem are defined
analogously.

2.1 Optimal Partitions for Graphs of Bounded Treewidth 

Let $G$ be a graph admitting a tree-decomposition $T = (I, F)$ of treewidth at
most $k$, for some constant $k$. By [9], one can easily modify $T$, without increasing
its treewidth, such that one can see $T$ as a rooted tree, with root $r \in I$, fulfilling
the following conditions:

1. $T$ is a binary tree.

2. If a node $i \in I$ has two children $j_1$ and $j_2$, then $X_i = X_{j_1} = X_{j_2}$.

3. If a node $i \in I$ has one child $j$, then either $X_j \subset X_i$ and $|X_i \setminus X_j| = 1$, or
$X_i \subset X_j$ and $|X_j \setminus X_i| = 1$.

Provided a tree-decomposition of width $k$ is given, such a modified
tree-decomposition of the same width can be constructed in linear time, whereby
the new decomposition tree has at most $O(|V(G)|)$ nodes. We will assume in the
remainder that such a modified tree-decomposition $T$ of $G$ is given.

For each node $i \in I$, let $Y_i$ denote the set of all vertices in a set $X_j$
with j = i or j a descendant of i in the rooted tree T. Our algorithm computes
for each $i \in I$ an array $maxp_i$ with $O(2^k |Y_i|)$ entries. For each
$l \in \{0, 1, ..., |Y_i|\}$ and each subset S of $X_i$, the entry $maxp_i(l,S)$
is set to

$$\max_{S' \subseteq Y_i,\ |S'| = l,\ S' \cap X_i = S} |\{(v,w) \in E \mid v \in S' \text{ and } w \in Y_i \setminus S'\}|.$$

In other words, $maxp_i(l,S)$ is set to the maximum number of cut edges in an
$(l, |Y_i| - l)$-partition of $Y_i$ where S and $X_i \setminus S$ are in
different sets of the partition and the set including S is of cardinality l. By
convention, if such a partition is impossible, $maxp_i(l,S)$ is set to $-\infty$.
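
As a point of reference for the quantity being tabulated, the following minimal sketch computes the maximum number of cut edges for every size l by exhaustive search. It is not the dynamic program of this section: it runs in exponential time and is only meant to make the definition of a maximum (l, n - l)-partition concrete. The function name and the edge-list representation are assumptions made for the illustration.

```python
from itertools import combinations

def max_partitions_bruteforce(vertices, edges):
    """Return a list best where best[l] is the maximum number of cut edges
    over all partitions of the vertex set into parts of size l and n - l."""
    n = len(vertices)
    best = []
    for l in range(n + 1):
        best_l = 0
        for side in combinations(vertices, l):
            s = set(side)
            # An edge is cut if exactly one endpoint lies in s.
            cut = sum(1 for u, v in edges if (u in s) != (v in s))
            best_l = max(best_l, cut)
        best.append(best_l)
    return best

# Cycle on four vertices: 0-1-2-3-0.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(max_partitions_bruteforce([0, 1, 2, 3], cycle))  # [0, 2, 4, 2, 0]
```

The entry for l = n/2 is the size of a maximum bisection, which is why computing all entries at once is convenient later on.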

The entries of the array are computed following the levels of the tree-decom- 
position T in a bottom-up manner. The following lemma shows how the array 
can be determined efficiently. 

Lemma 1.

- Let i be a leaf in T. Then for all $l \in \{0, 1, ..., |X_i|\}$ and
  $S \subseteq X_i$ where $|S| = l$,
  $maxp_i(l,S) = |\{(v,w) \in E \mid v \in S, w \in X_i \setminus S\}|$.
  The remaining entries of $maxp_i$ are set to $-\infty$.

- Let i be a node with one child j in T. If $X_i \subset X_j$ then for all
  $l \in \{0, 1, ..., |Y_i|\}$ and $S \subseteq X_i$,
  $maxp_i(l,S) = \max_{S' \subseteq X_j,\ S' \cap X_i = S} maxp_j(l,S')$.

- Let i be a node with one child j in T. If $X_j \cup \{v\} = X_i$ where
  $v \notin X_j$ then for all $l \in \{0, 1, ..., |Y_i|\}$ and $S \subseteq X_i$,
  if $v \in S$ then
  $maxp_i(l,S) = maxp_j(l-1, S \setminus \{v\}) + |\{(v,s) \in E \mid s \in X_i \setminus S\}|$,
  else $maxp_i(l,S) = maxp_j(l,S) + |\{(v,s) \in E \mid s \in S\}|$.

- Let i be a node with two children $j_1, j_2$ in T, with
  $X_i = X_{j_1} = X_{j_2}$. For all $l \in \{0, 1, ..., |Y_i|\}$ and
  $S \subseteq X_i$,
  $maxp_i(l,S) = \max_{l_1 + l_2 - |S| = l,\ l_1 \ge |S|,\ l_2 \ge |S|} \bigl( maxp_{j_1}(l_1,S) + maxp_{j_2}(l_2,S) - |\{(v,w) \in E \mid v \in S, w \in X_i \setminus S\}| \bigr)$.
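
The last case of Lemma 1 can be sketched as follows for one fixed subset S of the shared bag. This is only an illustration under assumptions: each child table is taken to be a dictionary from l to the stored value for that S, missing keys stand for $-\infty$, and the names `join_tables`, `s_size` and `inner_cut` are hypothetical.

```python
NEG_INF = float("-inf")

def join_tables(table1, table2, s_size, inner_cut):
    """Combine the tables of two children for one fixed subset S of the bag.

    table1[l1], table2[l2]: best cut values with l1 (resp. l2) vertices on
    the side containing S. s_size is |S|. inner_cut is the number of edges
    between S and the rest of the bag; these are counted in both children
    and therefore subtracted once.
    """
    result = {}
    for l1, v1 in table1.items():
        for l2, v2 in table2.items():
            if l1 < s_size or l2 < s_size:
                continue
            if v1 == NEG_INF or v2 == NEG_INF:
                continue
            l = l1 + l2 - s_size  # the vertices of S are shared by both sides
            result[l] = max(result.get(l, NEG_INF), v1 + v2 - inner_cut)
    return result

# Path b - a - c with bag {a} and S = {a}: each child sees one edge at a.
child = {0: NEG_INF, 1: 1, 2: 0}
print(join_tables(child, child, s_size=1, inner_cut=0))  # {1: 2, 2: 1, 3: 0}
```

For a fixed S the double loop costs time quadratic in the table size, in line with the per-node bound stated after the lemma.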

It follows that computing an array $maxp_i$ on the basis of the arrays computed
for the preceding level of T can be done in time $O(2^k |Y_i|^2)$. Consequently,
one can compute the array $maxp_r$ for the root r of T in cubic time.

Theorem 1. All maximum $(l, n - l)$-partitions of a graph on n nodes given with
a tree-decomposition of treewidth k can be computed in time $O(2^k n^3)$.

By substituting min for max, we can analogously compute all minimum
$(l, n - l)$-partitions of a graph with constant treewidth.

Theorem 2. All minimum $(l, n - l)$-partitions of a graph on n nodes given with
a tree-decomposition of treewidth k can be computed in time $O(2^k n^3)$.

By Fact 1 we obtain the following corollary. 

Corollary 1. All maximum and minimum $(l, n - l)$-partitions of a bounded
treewidth graph on n vertices can be computed in time $O(n^3)$.

Since a tree-decomposition of a planar graph on n vertices with treewidth
$O(\sqrt{n})$ can be found in polynomial time by the planar separator theorem
[7], we obtain also the following corollary.

Corollary 2. All maximum and minimum $(l, n - l)$-partitions of a planar graph
on n vertices can be computed in time $2^{O(\sqrt{n})}$.

3 A PTAS for Max-Bisection of an Arbitrary Planar 
Graph 

The authors of [18] observed that the requirement of equal size of the vertex
subsets in a two-partition yielding a max-bisection makes the max-bisection
problem hardly expressible as a maximum planar satisfiability formula. For this
reason we cannot directly apply Khanna-Motwani's [19] syntactic framework
yielding PTASs for several basic graph problems on planar graphs (e.g., max
cut). Instead, we combine the original Baker's method [4] with our algorithm
for optimal maximum partitions on graphs of bounded treewidth via tree-type
dynamic programming in order to derive the first PTAS for max-bisection of an
arbitrary planar graph.

Algorithm 1

input: a planar graph G = (V, E) on n vertices and a positive integer k;
output: $(1 - \frac{1}{k+1})$-approximations of all maximum $(l, n - l)$-partitions of G

1. Construct a plane embedding of G;

2. Set the level of a vertex in the embedding as follows: the vertices on the
   outer boundary have level 1, the vertices on the outer boundary of the
   subgraph obtained by deleting the vertices of level i - 1 have level i; by
   convention, extend the levels by k empty ones numbered -k+1, -k+2, ..., 0;

3. For each level j in the embedding construct the subgraph $H_j$ of G induced
   by the vertices on levels j, j+1, ..., j+k;

4. For each level j in the embedding set $n'_j$ to the number of vertices in
   $H_j$ and compute all maximum $(l, n'_j - l)$-partitions of $H_j$;

5. For each i, $0 \le i \le k$, set $G_i$ to the union of the subgraphs $H_j$
   where $j \bmod (k+1) = i$;

6. For each i, $0 \le i \le k$, set $n_i$ to the number of vertices in $G_i$ and
   compute all maximum $(l, n_i - l)$-partitions of $G_i$ by dynamic programming
   in a tree fashion, i.e., first compute all maximum partitions for pairs of
   "consecutive" $H_j$ where $j \bmod (k+1) = i$, then for quadruples of such
   $H_j$ etc.;

7. For each l, $1 \le l < n$, output the largest among the maximum
   $(l, n - l)$-partitions of $G_i$, $0 \le i \le k$.



Lemma 2. For each l, $1 \le l < n$, Algorithm 1 outputs an $(l, n - l)$-partition
of G within $k/(k+1)$ of the maximum.

Proof. Let P be a maximum $(l, n - l)$-partition of G. For each edge e in P,
there is at most one i, $0 \le i \le k$, such that e is not an edge of $G_i$.
Consequently, there is an $i'$, $0 \le i' \le k$, such that $G_{i'}$ misses at
most $|P|/(k+1)$ edges of P. It follows that a maximum $(l, n - l)$-partition of
such a $G_{i'}$ cuts at least $k|P|/(k+1)$ edges. Algorithm 1 outputs an
$(l, n - l)$-partition of G cutting at least as many edges as a maximum
$(l, n - l)$-partition of $G_{i'}$. □
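
The counting step in this proof can be checked on small inputs with the sketch below. It assumes the levels of step 2 are already given as a dictionary (computing them from an embedding is not shown) and that adjacent vertices differ by at most one level, as they do for the levels of a plane embedding. The function name `missed_edges` is hypothetical.

```python
def missed_edges(level, edges, k):
    """For each residue i in 0..k, list the edges of G that are not in G_i.

    G_i is the union of the subgraphs H_j with j mod (k+1) == i, where H_j
    is induced by levels j, ..., j+k. For a fixed i these windows of k+1
    consecutive levels are disjoint, so an edge is missed by G_i exactly
    when its endpoints fall into different windows.
    """
    missed = {i: [] for i in range(k + 1)}
    for u, v in edges:
        for i in range(k + 1):
            # Window number of level t for residue i: floor((t - i) / (k + 1)).
            if (level[u] - i) // (k + 1) != (level[v] - i) // (k + 1):
                missed[i].append((u, v))
    return missed

level = {0: 1, 1: 1, 2: 2, 3: 2, 4: 3}
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
print(missed_edges(level, edges, k=1))
# {0: [(0, 2), (1, 3)], 1: [(2, 4), (3, 4)]}
```

Since every edge appears in at most one of the k+1 lists, the shortest list has at most |E|/(k+1) entries, which is the averaging argument used above.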

Lemma 3. Algorithm 1 runs in $O(k 2^{3k-1} n^3)$ time.

Proof. The time complexity of the algorithm is dominated by that of steps 4 and
6.

The subgraphs $H_j$ of G are so-called k-outerplanar graphs and have bounded
treewidth 3k - 1 [7]. Hence, for a given i, $0 \le i \le k$, all maximum
$(l, n'_j - l)$-partitions of $H_j$ where $j \bmod (k+1) = i$ can be computed in
time $O(2^{3k-1} n^3)$ by Lemma 1, the pairwise disjointness of the subgraphs
and $j \le n$. It follows that the whole step 4 can be implemented in time
$O(k 2^{3k-1} n^3)$.

In step 6, a maximum $(l, n_i - l)$-partition of the union of $2^{q+1}$
"consecutive" $H_j$'s satisfying $j \bmod (k+1) = i$ can be determined on the
basis of appropriate maximum partitions of its two halves, each being the union
of $2^q$ of the $H_j$'s, in time O(n). Hence, since $l \le n_i$ and the number
of nodes in the dynamic programming tree is O(n), the whole step 6 takes
$O(k n^3)$ time. □



Theorem 3. Algorithm 1 yields a PTAS for all maximum $(l, n - l)$-partitions
of a planar graph.



Corollary 3. The problem of max-bisection on planar graphs admits a PTAS. 

4 Observations on Min-Bisection for Planar Graphs 

We can easily obtain an analogous PTAS for min-bisection of planar graphs in
the very special case when the size of min-bisection is $\Omega(n)$. Simply, at
least one of the subgraphs $G_i$ of G misses at most $|E|/(k+1)$ edges of G.
Therefore, the number of edges cut by a min-bisection of such a $G_i$ can
increase at most by $|E|/(k+1)$ in G. By picking k sufficiently large we can
guarantee an arbitrarily close approximation of min-bisection in G.

In fact, we can obtain even a slightly stronger result on min-bisection for
planar graphs by observing that our method runs in polynomial time even for
non-constant k (up to $O(\log n)$) provided that a tree-decomposition of graphs
with treewidth equal to such a k can be determined in polynomial time. At
present, the best tree-decomposition algorithms have the leading term $k^k$ [8],
so we can set k to $O(\log n/\log\log n)$ keeping the polynomial time
performance of our method. In this way, we obtain the following theorem.

Theorem 4. The min-bisection problem on planar graphs in which the size of
min-bisection is $\Omega(n \log\log n/\log n)$ admits a PTAS.

Observe that the presence of large degree vertices in a planar graph can cause
a large min-bisection, e.g., in a star graph. For bounded-degree planar graphs
the size of min-bisection is $O(\sqrt{n})$ by the following argument.

For a planar graph of maximum degree d construct a separator tree by applying
the planar separator theorem [20] recursively. Next, find a path in the tree
from the root down to the median leaf. By deleting the edges incident to the
vertex separators along the path and additionally O(1) edges, we can easily
halve the set of vertices of the graph such that none of the remaining edges
connects a pair of vertices from the opposite halves. The number of deleted
edges is clearly $O(d\sqrt{n})$. In fact, we do not have to construct the whole
separator tree, but just the path, and this can be easily done in time
$O(n \log n)$ [20].

Theorem 5. For a planar graph on n vertices and maximum degree d, a bisection
of size $O(d\sqrt{n})$ can be found in time $O(n \log n)$.

Clearly, if a graph has an O(1)-size bisection, it can be found by exhaustive
search in polynomial time. We conclude that at present we have efficient
methods for at least O(1)-approximation of min-bisection in planar graphs if its
size is either $\Omega(n \log\log n/\log n)$ or O(1), or $O(\sqrt{n})$ and the
maximum degree is bounded by a constant. These observations suggest that a
substantial improvement of the logarithmic approximation factor for
min-bisection on planar graphs given in [11] might be possible.

5 PTAS for Max-Bisection of a Unit Disk Graph 

In this section we design a PTAS for max-bisection of unit disk graphs, another 
important class of graphs defined by the geometric conditions on a plane. 

Recall that an undirected graph G is a unit disk graph if its vertices can be
put in one-to-one correspondence with disks of equal radius in the plane in such
a way that two vertices are joined by an edge if and only if the corresponding
disks intersect. Tangent disks are considered to intersect. We may assume w.l.o.g.
that the radius of each disk is one. Since the recognition problem for unit disk
graphs is NP-hard, we shall also assume that a geometric representation of the
graph is given as input.

Our technique works in a similar way as in the case of planar graphs. The
input graph G is divided into families of subgraphs $H_{ij}$ using the ideas of
Hunt et al. given in [15]. Next, approximate solutions to all
$(l, n_{ij} - l)$-partitions of every subgraph $H_{ij}$, where $n_{ij}$ denotes
the number of vertices in $H_{ij}$, are computed by the methods given in [2].
Via a tree-type dynamic programming these solutions are used to obtain an
overall solution for G.

In order to divide the graph G, we impose a grid of horizontal and vertical
lines on the plane that are 2 apart from each other. The v-th vertical line,
$-\infty < v < \infty$, is at x = 2v. The h-th horizontal line,
$-\infty < h < \infty$, is at y = 2h. We say that the v-th vertical line has
index v and that the h-th horizontal line has index h. Further, we denote the
vertical strip between the v-th and the (v+1)-th vertical line as the strip with
index v, and analogously for the horizontal strip between the h-th and the
(h+1)-th horizontal line.

Each vertical strip is left closed and right open, each horizontal strip is
closed at the top and open at the bottom. A disk is said to lie in a given strip
if its center lies in that strip. Note that every disk lies in exactly one
horizontal and one vertical strip.
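
The strip conventions above translate directly into index arithmetic. The sketch below is an illustration only; the function names and the representation of disks by a list of center coordinates are assumptions.

```python
import math

def strip_indices(x, y):
    """Return (vertical strip index, horizontal strip index) of a disk center."""
    v = math.floor(x / 2)     # vertical strip v covers 2v <= x < 2v + 2
    h = math.ceil(y / 2) - 1  # horizontal strip h covers 2h < y <= 2h + 2
    return v, h

def subgraph_vertices(centers, i, j, k):
    """Indices of the disks lying in horizontal strips i..i+k and
    vertical strips j..j+k."""
    chosen = []
    for idx, (x, y) in enumerate(centers):
        v, h = strip_indices(x, y)
        if i <= h <= i + k and j <= v <= j + k:
            chosen.append(idx)
    return chosen

print(strip_indices(0.0, 2.0))   # (0, 0): on the left edge and on the top edge
print(strip_indices(2.0, 0.0))   # (1, -1)
print(subgraph_vertices([(0.0, 2.0), (2.0, 0.0), (5.0, 1.0)], 0, 0, 1))  # [0]
```

The floor and ceiling are what make each center fall into exactly one vertical and one horizontal strip, including centers that sit on a grid line.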

For a fixed k consider the subgraph $H_{ij}$ of G, $-\infty < i, j < \infty$,
induced by the disks that lie in the intersection of the horizontal strips
i, i+1, ..., i+k and the vertical strips j, j+1, ..., j+k. Let $n_{ij}$ be the
number of vertices of $H_{ij}$. By a packing argument it can be shown that for
fixed k > 0, the size of a maximum independent set of such a subgraph is at most
$2(k+3)^2\pi$.

Lemma 4. There is a positive constant c such that if $n_{ij} > c \log n$ then
the subgraph $H_{ij}$ of G is dense.

Proof. Partition the vertex set of $H_{ij}$ successively into maximal
independent sets by determining a maximal independent set $I_1$, removing its
vertices, again determining a maximal independent set $I_2$, and so on. As
described above, the number of independent sets is at least
$n_{ij}/(2(k+3)^2\pi)$. Since each $I_j$ is maximal, there is at least one edge
from a vertex of $I_j$ to every $I_{j'}$, $j < j'$. If we understand the set of
independent sets as a complete graph on $n_{ij}/(2(k+3)^2\pi)$ vertices, it
follows that $H_{ij}$ has $\Omega(n_{ij}^2)$ edges and hence $H_{ij}$ is dense. □
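
The peeling procedure used in this proof can be sketched as follows. The greedy choice of each maximal independent set, the adjacency-dictionary representation and the function name are assumptions made for illustration; vertices are assumed to be sortable.

```python
def peel_maximal_independent_sets(adj):
    """adj maps each vertex to the set of its neighbours.
    Returns the list of sets I_1, I_2, ... obtained by repeatedly removing a
    greedily built maximal independent set of the remaining graph."""
    remaining = set(adj)
    layers = []
    while remaining:
        independent = set()
        for v in sorted(remaining):
            # Add v unless one of its neighbours was already chosen.
            if not (adj[v] & independent):
                independent.add(v)
        layers.append(independent)
        remaining -= independent
    return layers

# Triangle 0-1-2 with a pendant vertex 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(peel_maximal_independent_sets(adj))  # [{0, 3}, {1}, {2}]
```

Because every vertex left over after a round has a neighbour in the set just removed, any two of the returned sets are joined by at least one edge, which is the property the proof relies on.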

Corollary 4. If $n_{ij} > c \log n$ then the size of a maximum bisection of
$H_{ij}$ is $\Omega(n_{ij}^2)$.

Proof. Partition the vertex set of $H_{ij}$ as before and use the maximal
independent sets to build up the sets of the bisection. Since all independent
sets are maximal there are $\Omega(n_{ij}^2)$ edges between the sets of the
bisection. □

Consequently the techniques given in [2] are applicable to the subgraph Hij. 

Algorithm 2

input: a unit disk graph G = (V, E) specified by a set V of disks in the plane
and the coordinates of their centers and a positive integer k;
output: $(1 - \frac{1}{k+1})^2 (1 - \delta)$-approximations of maximum bisection of G

1. Divide the plane by imposing a grid of width two;

2. Construct the subgraphs $H_{ij}$ of G as described above;

3. For each i and each j set $n_{ij}$ to the number of vertices in $H_{ij}$ and
   compute all $(l, n_{ij} - l)$-partitions of $H_{ij}$ either approximately, or
   optimally if $n_{ij} = O(\log n)$;

4. For each r and s, $0 \le r, s \le k$, set $G_{r,s}$ to the union of the
   subgraphs $H_{ij}$ where $i \bmod (k+1) = r$ and $j \bmod (k+1) = s$;

5. For each r and s, $0 \le r, s \le k$, set $n_{r,s}$ to the number of vertices
   in $G_{r,s}$ and compute a bisection of $G_{r,s}$ within $(1 - \delta)$ of
   its maximum by dynamic programming in a tree fashion. To this end, enumerate
   the subgraphs in increasing order of the sum i + j and compute all partitions
   of pairs of "consecutive" $H_{ij}$ with respect to this ordering on the basis
   of the computed partitions, then for quadruples of such $H_{ij}$ etc.;

6. Output the largest bisection of $G_{r,s}$, $0 \le r, s \le k$.

If $n_{ij} \le c \log n$ we can find all the maximum $(l, n_{ij} - l)$-partitions
of the subgraph $H_{ij}$ in polynomial time by enumerating all possibilities.
Otherwise the problem is solvable approximately in polynomial time by solving
the following polynomial integer program:

$$\text{maximize} \quad \sum_{\{i,j\} \in E} \bigl( x_i(1 - x_j) + x_j(1 - x_i) \bigr) \qquad (1)$$

$$\text{subject to} \quad \sum_i x_i = l, \qquad (2)$$

$$x_i \in \{0, 1\}, \quad i = 1, ..., n_{ij} \qquad (3)$$

This program can be solved by the use of Theorem 1.10 given in [2] within an
error of at most $\epsilon n_{ij}^2$, which also satisfies the linear constraint
(2) of the program within an additive error of
$O(\epsilon \sqrt{n_{ij} \log n_{ij}})$. In order to get a subset of size l we
move at most $\epsilon \sqrt{n_{ij} \log n_{ij}}$ vertices in or out. This
affects the number of edges included in the partition by at most
$\epsilon n_{ij} \sqrt{n_{ij} \log n_{ij}} \le \epsilon n_{ij}^2$. Hence we can
compute a maximum $(l, n_{ij} - l)$-partition of a subgraph $H_{ij}$ that has
more than $c \log n$ vertices within an additive error of $2\epsilon n_{ij}^2$
of the maximum.

Lemma 5. Algorithm 2 outputs a bisection of G within
$(1 - \frac{1}{k+1})^2 (1 - \delta)$ of the maximum.

Proof. Let P be a maximum bisection of G. For each edge $e \in P$ and a fixed r,
$0 \le r \le k$, there is at most one s, $0 \le s \le k$, such that e crosses a
vertical line whose index modulo k+1 is s. Analogously, there is for each
$e \in P$ and a fixed s, $0 \le s \le k$, at most one r, $0 \le r \le k$, such
that e crosses a horizontal line whose index modulo k+1 is r. Consequently there
is a pair (r, s), $0 \le r, s \le k$, such that a maximum
$(l, n - l)$-partition of $G_{r,s}$ cuts at least $(1 - \frac{1}{k+1})^2 |P|$
edges.

By Corollary 4, the size of maximum bisection of the subgraph $G'_{r,s}$ of
$G_{r,s}$ that consists of all $H_{ij}$ with more than $c \log n$ vertices is
$\Omega(\sum_{n_{ij} > c \log n} n_{ij}^2)$. Consequently, the error caused by
the solutions of the polynomial integer programs for the subgraphs $H_{ij}$ of
$G'_{r,s}$ is at most a $\delta = 2\epsilon$ fraction of an optimum solution of
maximum bisection for $G'_{r,s}$. Since the partitions for each $H_{ij}$ with
at most $c \log n$ vertices are computed optimally, we obtain a bisection of
$G_{r,s}$ within $(1 - \delta)$ of the maximum.

Thus Algorithm 2 outputs a bisection of G within
$(1 - \frac{1}{k+1})^2 (1 - \delta)$ of the maximum. □



Theorem 6. The problem of max-bisection on unit disk graphs admits a PTAS. 

The same approach can be used to obtain a PTAS for the maximum bisection
problem in geometric intersection graphs both of other regular polygons and also
of regular geometric objects in higher dimensions.

Abstract. We show that the families (k, r)-RBC of languages accepted
(in quasi-realtime) by one-way counter automata having k blind counters
of which r are reversal-bounded form a strict and linear hierarchy of
semi-AFLs. This hierarchy comprises the families
BLIND = $\mathcal{M}_\cap(C_1)$ of blind multicounter languages with generator
$C_1 := \{w \in \{a_1, b_1\}^* \mid |w|_{a_1} = |w|_{b_1}\}$ and
RBC = $\mathcal{M}_\cap(B_1)$ of reversal-bounded multicounter languages
with generator $B_1 := \{a_1^n b_1^n \mid n \in \mathbb{N}\}$. This generalizes
and sharpens the known results from [Grei 78] and [Jant 98].



1 Introduction 

Hierarchies of counter automata are often proved by arguments concerning the 
dimension of the memory space, i.e., the number of counters, see for example 
[FiMR 68, Grei 76, Grei 78] or counting cycles within the computations, as in 
[Hrom 86]. If one does not alter the dimension, and changes only the strategy 
of accessing the counters, other methods have to be found. The method applied 
here for the first time uses techniques from linear algebra and shows that the 
formerly known two hierarchies of blind and of reversal-bounded multicounter 
languages are in fact part of one linear hierarchy of semi-AFLs. 

The family of languages accepted by one-way reversal-bounded multicounter
automata (in quasi-realtime) is a well known semi-AFL which is principal as an
intersection-closed semi-AFL $\mathcal{M}_\cap(B_1)$ with generator
$B_1 := \{a_1^n b_1^n \mid n \in \mathbb{N}\}$, but which is not a principal
semi-AFL, see [FiMR 68, Grei 78].

The known situation for these hierarchies, shown in [Grei 78], is as follows:
$\bigcup_{i \ge 1} \mathcal{M}(C_i) = \mathcal{M}_\cap(C_1) = BLIND = \bigcup_{i \ge 1} \mathcal{M}(B_i) = \mathcal{M}_\cap(B_1) = RBC$,
where $\mathcal{M}(\mathcal{L})$ denotes the least trio generated by the family
$\mathcal{L}$, which is a semi-AFL if $\mathcal{L} = \{L\}$, and then we write
$\mathcal{M}(L)$ instead of $\mathcal{M}(\mathcal{L})$. For all $i \ge 1$ we
have $\mathcal{M}(B_i) \subsetneq \mathcal{M}(B_{i+1})$, see [Gins 75], and
$\mathcal{M}(C_i) \subsetneq \mathcal{M}(C_{i+1})$, shown in
[Grei 76, Grei 78]. (For the definition of the languages $B_i$ and $C_i$ see
Definition 2.1 below.)

We study the families (k, r)-RBC of languages accepted (in quasi-realtime)
by one-way (or on-line) counter automata having k blind counters of which
$r \le k$ are reversal-bounded and prove
$(k_1, r_1)$-RBC $\subsetneq$ $(k_2, r_2)$-RBC if and only if $k_1 < k_2$, or
$k_1 = k_2$ and $r_1 > r_2$. Then (k, 0)-RBC = $\mathcal{M}(C_k)$, and
$\bigcup_{k \ge 1} (k, 0)$-RBC $= \bigcup_{i \ge 1} \mathcal{M}(C_i) = \mathcal{M}_\cap(C_1)$
forms a hierarchy of twist-closed semi-AFLs (see [Jant 98]). The strict
inclusions are proved here for the first time.



A. Ferreira and H. Reichel (Eds.): STACS 2001, LNCS 2010, pp. 376-387, 2001.
© Springer-Verlag Berlin Heidelberg 2001

2 Basic Definitions

Definition 1. For any alphabet $\Sigma$ and $x \in \Sigma$ let $|w|_x$ denote
the number of occurrences of the symbol x within the string $w \in \Sigma^*$,
$|w| := \sum_{x \in \Sigma} |w|_x$. $\psi : \Sigma^* \to \mathbb{N}^n$ is the
Parikh mapping, defined by $\psi(w) := (|w|_{x_1}, ..., |w|_{x_n})$, where
$n := |\Sigma|$. The empty word is denoted by $\lambda$ and
$\psi(\lambda) = 0 \in \mathbb{N}^n$ is the vector, all of whose coordinates
are 0.

The languages we use here are constructed using the specific alphabet
$\Gamma_n$ specified for each $n \in \mathbb{N}$, $n \ge 1$ by:
$\Gamma_n := \{a_i, b_i \mid 1 \le i \le n\}$, and the homomorphisms $h_i$
defined for $i \ge 1$ by:

$$h_i(x) := \begin{cases} x, & \text{if } x \in \{a_i, b_i\} \\ \lambda, & \text{else} \end{cases}$$

$$C_n := \{ w \in \Gamma_n^* \mid \forall\, 1 \le i \le n : |w|_{a_i} = |w|_{b_i} \}$$

$$B_n := \{ w \in C_n \mid \forall\, 1 \le i \le n : h_i(w) = a_i^m b_i^m \text{ for some } m \in \mathbb{N} \}$$

$$D_n := \{ w \in \Gamma_n^* \mid \forall\, 1 \le i \le n : ( |w|_{a_i} = |w|_{b_i} \wedge \forall\, w = uv : |u|_{a_i} \ge |u|_{b_i} ) \}$$

The language $D_1$ defined above is the so-called semi-Dyck language on one
pair of brackets which is often abbreviated by $D_1'^{*}$, see e.g. [Bers 80].
$D_n$ here denotes the n-fold shuffle of disjoint copies of the semi-Dyck
language $D_1$ and it is known, [Grei 78, Jant 79], that
$\bigcup_{i \ge 1} \mathcal{M}(D_i) = \mathcal{M}_\cap(D_1) = PBLIND(n)$.
The latter family consists of languages accepted in quasi-realtime by
nondeterministic one-way multicounter acceptors which operate in such a way
that in every computation no counter can store a negative value, and the
information on whether or not the value stored in a counter is zero is not used
for deciding the next move.

The languages $C_n$ are the (symmetric) Dyck languages on n pairs of brackets
$a_i, b_i$, often abbreviated by $D_n^*$, see again [Bers 80]. Greibach,
[Grei 78], has shown that
$\bigcup_{i \ge 1} \mathcal{M}(C_i) = \mathcal{M}_\cap(C_1) = BLIND = BLIND(lin) = BLIND(n) = \bigcup_{i \ge 1} \mathcal{M}(B_i) = \mathcal{M}_\cap(B_1) = RBC(n) = RBC \subsetneq PBLIND$.

Here BLIND (BLIND(n), BLIND(lin)) denotes the family of languages accepted
(in quasi-realtime, linear time, resp.) by nondeterministic one-way
multicounter acceptors which operate in such a way that in every computation
all counters may store arbitrary integers, and the information on the contents
of the counters is not used for deciding the next move. The family RBC is the
family of languages accepted by nondeterministic one-way multicounter acceptors
performing at most one reversal in each computation. The formal definition is
to be found in Section 3.
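
The three language families can be made concrete with small membership tests. The sketch below is an illustration under assumptions: a symbol $a_i$ or $b_i$ is encoded as the pair `("a", i)` or `("b", i)`, a word is a list of such pairs over $\Gamma_n$, and all function names are hypothetical.

```python
def count(word, sym):
    """Number of occurrences of the symbol sym in word."""
    return sum(1 for s in word if s == sym)

def h(word, i):
    """Homomorphism h_i: keep a_i and b_i, erase every other symbol."""
    return [s for s in word if s in (("a", i), ("b", i))]

def in_C(word, n):
    """Equal numbers of a_i and b_i for every i."""
    return all(count(word, ("a", i)) == count(word, ("b", i))
               for i in range(1, n + 1))

def in_B(word, n):
    """Member of C_n whose projection h_i(word) is a_i^m b_i^m for every i."""
    if not in_C(word, n):
        return False
    for i in range(1, n + 1):
        proj = h(word, i)
        m = len(proj) // 2
        if proj != [("a", i)] * m + [("b", i)] * m:
            return False
    return True

def in_D(word, n):
    """Balanced, and no prefix has more b_i than a_i, for every i."""
    for i in range(1, n + 1):
        balance = 0
        for s in word:
            if s == ("a", i):
                balance += 1
            elif s == ("b", i):
                balance -= 1
            if balance < 0:
                return False
        if balance != 0:
            return False
    return True

a1, b1, a2, b2 = ("a", 1), ("b", 1), ("a", 2), ("b", 2)
print(in_C([b1, a1], 1), in_B([b1, a1], 1), in_D([b1, a1], 1))  # True False False
print(in_B([a1, a2, b1, b2], 2))                                # True
```

The word $b_1 a_1$ separates $C_1$ from both $B_1$ and $D_1$, while $a_1 b_1 a_1 b_1$ lies in $D_1$ but not in $B_1$.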



3 Blind k Counter Automata with $r \le k$
Reversal-Bounded Counters.

We shall deal only with counter automata that have a one-way read-only input
tape (also known as on-line automata) and have k blind counters of which
precisely r counters are reversal-bounded.

Definition 2. A blind k-counter automaton $M := (Q, \Sigma, \delta, q_0, Q_{fin})$
consists of a finite set of states Q, a designated initial state $q_0 \in Q$, a
designated set of final states $Q_{fin} \subseteq Q$, a finite input alphabet
$\Sigma$, and a transition function
$\delta : Q \times (\Sigma \cup \{\lambda\}) \to 2^{Q \times \{+1, 0, -1\}^k}$.

An instantaneous description (ID) of M is an element of
$Q \times \Sigma^* \times \mathbb{Z}^k$. We write
$(q_1, aw, z_1, ..., z_k) \vdash_M (q_2, w, z_1 + \Delta(1), ..., z_k + \Delta(k))$
if $(q_2, \Delta) \in \delta(q_1, a)$, where
$(\Delta(1), ..., \Delta(k)) = \Delta^t$ is the transpose of the vector
$\Delta$, and we omit the subscript M if no confusion will arise. $\vdash^*$
denotes the reflexive transitive closure of the computation relation $\vdash$
and is defined as usual from the n-step computation relations
$\vdash^{n+1} := \vdash \circ \vdash^n$ by $\vdash^* := \bigcup_{i \ge 0} \vdash^i$,
where $\vdash^0$ is the identity relation on the IDs of the nondeterministic
automaton M.

$ID_i \vdash^* ID_j$ is an accepting computation for w iff
$ID_i := (q_0, w, 0, ..., 0)$ and $\exists q_e \in Q_{fin}$ such that
$ID_j := (q_e, \lambda, 0, ..., 0)$.

$L(M) := \{w \in \Sigma^* \mid M \text{ has an accepting computation for } w\}$
is the language accepted by M.

A specific k-counter automaton M can most easily be described by a finite
state transition diagram in which a directed arc from state $q_1$ to $q_2$ is
inscribed by the input symbol x to be processed and a vector
$\Delta \in \{+1, 0, -1\}^k$ used for updating the counters by adding the
component $\Delta(i)$ of $\Delta$ to the current contents $z_i$ of the i-th
counter. This will be written as $q_1 \xrightarrow{x,\, \Delta} q_2$.
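
To make the acceptance condition concrete, here is a small breadth-first simulator for blind k-counter automata. It is a sketch under assumptions, not part of the construction: the transition function is a dictionary from (state, symbol) to a list of (next state, update) pairs, the empty string stands for $\lambda$, and the step bound `max_steps` is an added safeguard against automata with unbounded $\lambda$-loops.

```python
from collections import deque

def accepts(delta, q0, finals, k, word, max_steps=10_000):
    """Search for an accepting computation: the input is consumed, the state
    is final and all k counters are zero. Moves never inspect the counters,
    which is what makes the automaton blind."""
    start = (q0, 0, (0,) * k)
    seen = {start}
    queue = deque([start])
    steps = 0
    while queue and steps < max_steps:
        state, pos, counters = queue.popleft()
        steps += 1
        if pos == len(word) and state in finals and all(c == 0 for c in counters):
            return True
        moves = [("", pos)]                      # lambda move
        if pos < len(word):
            moves.append((word[pos], pos + 1))   # read one input symbol
        for symbol, new_pos in moves:
            for nxt, update in delta.get((state, symbol), []):
                cfg = (nxt, new_pos,
                       tuple(c + d for c, d in zip(counters, update)))
                if cfg not in seen:
                    seen.add(cfg)
                    queue.append(cfg)
    return False

# One state, one blind counter: accepts words with equally many a's and b's.
delta = {("q", "a"): [("q", (1,))], ("q", "b"): [("q", (-1,))]}
print(accepts(delta, "q", {"q"}, 1, "abba"))  # True
print(accepts(delta, "q", {"q"}, 1, "ba"))    # True, the counter may go negative
print(accepts(delta, "q", {"q"}, 1, "aab"))   # False
```

The example automaton corresponds to a state diagram with two loops at q, one inscribed a, (+1) and one inscribed b, (-1).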

Definition 3. A blind k-counter automaton
$M := (Q, \Sigma, \delta_M, q_0, Q_{fin})$ accepts L(M) in linear time with
factor $d \in \mathbb{N}$, if for any $w \in L(M)$ there exists an accepting
n-step computation $ID_0 \vdash^n ID_1$ for w such that
$n \le d \cdot \max(|w|, 1)$.

If there exists $d \in \mathbb{N}$ such that
$(q_1, \lambda, z_1, ..., z_k) \vdash^n (q_2, \lambda, z_1', ..., z_k')$ implies
$n \le d$, then the automaton M is said to work in quasi-realtime of delay d.
If in this case d = 0 then M works in realtime.

The i-th counter ($1 \le i \le k$) of some blind k-counter automaton M is
reversal-bounded iff for any subcomputation
$(q_0, w, 0, ..., 0) \vdash^* (q_1, w_1, x_1, ..., x_k) \vdash^* (q_2, w_2, y_1, ..., y_k) \vdash^* (q_3, w_3, z_1, ..., z_k)$,
$x_i > y_i$ implies $y_i \ge z_i$.

By this definition, a reversal-bounded counter has to be increased first and 
decreased after its reversal. Counters that are first decreased and solely increased 
after one reversal can be replaced by those required by Definition 3 above. In 
addition, reversal bounded counters are forced by the finite control to perform 
at most one reversal on each computation, even in the non-accepting ones! 

Definition 4. For all $k, r \in \mathbb{N}$ let (k, r)-RBC denote the family of
languages accepted by (k, r)-counter automata, i.e., accepted by on-line counter
automata having k blind counters of which r are reversal-bounded.

Obviously we have $\mathcal{M}(C_k)$ = (k, 0)-RBC and $\mathcal{M}(B_k)$ = (k, k)-RBC.

Definition 5.
$L_{k,r} := \{w \in \Gamma_k^* \mid \forall\, 1 \le i \le r : h_i(w) = a_i^m b_i^m \text{ for some } m \in \mathbb{N} \ \wedge\ \forall\, r+1 \le i \le k : |h_i(w)|_{a_i} = |h_i(w)|_{b_i}\}$.

By results from Ginsburg and Greibach ([GiGr 70], Corr. 3, and [Gins 75],
Prop. 3.6.1), one can deduce that the language $L_{k,r}$ is a generator of the
family (k, r)-RBC. We do not give a detailed explanation using these standard
techniques and state Lemma 1 without proof:

Lemma 1. (k, r)-RBC = $\mathcal{M}(L_{k,r})$.

Greibach showed $C_1 \in \mathcal{M}(B_3)$ (Lemma 1 in [Grei 78]). It was shown
in [Jant 98] that it is sufficient to accept $C_k$ using only k+1
reversal-bounded counters, which is stated in Lemma 2.

Lemma 2. $\forall k \in \mathbb{N}, k \ge 1 : \mathcal{M}(C_k) \subsetneq \mathcal{M}(B_{k+1})$.

Ginsburg ([Gins 75], Example 4.5.2) has shown
$\mathcal{M}(B_k) \subsetneq \mathcal{M}(B_{k+1})$. And
$\mathcal{M}(C_i) \subsetneq \mathcal{M}(C_{i+1})$ has been shown in [Grei 76],
[Grei 78].

We will obtain the sharpening of the above results by proving Lemma 6 and
the main result Theorem 1.

For the formulation and usage of techniques from linear algebra to prove
these results we need some more notation that in most cases applies only to
those (k, r)-counter automata which accept languages from

Definition 6. For any (k, r)-counter automaton
$A := (Q, \Sigma, \delta_A, q_0, Q_{fin})$ let
$G_A \subseteq Q \times \Sigma \times \{+1, 0, -1\}^k \times Q$ be the finite
set defined by $G_A := \{(p, x, \Delta, q) \mid (q, \Delta) \in \delta_A(p, x)\}$,
which is in bijection with the arcs of A's state diagram. For later use let
$n_A := |G_A|$ be the number of elements in the arbitrarily but fixed ordered
set $G_A = \{g_1, g_2, ..., g_{n_A}\}$. (The ordering that is actually used
depends on L(A) and will be described later.)

The four projections $\pi_i$, $1 \le i \le 4$, $\pi_1, \pi_4 : G_A \to Q$,
$\pi_2 : G_A \to \Sigma \cup \{\lambda\}$, and
$\pi_3 : G_A \to \{+1, 0, -1\}^k$, are defined by:
$\pi_1((p, x, \Delta, q)) := p$, $\pi_2((p, x, \Delta, q)) := x$,
$\pi_3((p, x, \Delta, q)) := \Delta$, $\pi_4((p, x, \Delta, q)) := q$.

The mappings $\pi_1$ and $\pi_4$ are mere codings, whereas $\pi_2$ and $\pi_3$
are canonically extended to homomorphisms, by mild abuse of notation: For all
strings $u, v \in G_A^*$ let $\pi_2 : G_A^* \to \Sigma^*$ with
$\pi_2(uv) = \pi_2(u)\pi_2(v)$ and $\pi_3 : G_A^* \to \mathbb{Z}^k$ with
$\pi_3(uv) = \pi_3(u) + \pi_3(v)$, where + is the componentwise addition of the
vectors $\pi_3(u)$ and $\pi_3(v)$. For easier readability let
$\Delta_g := \pi_3(g)$ denote the counter update induced by the transition
$g \in G_A$ of A.

Let
$R_A := \{g_{i_0} g_{i_1} \cdots g_{i_t} \mid t \in \mathbb{N}\ \forall \mu \in \{0, ..., t\} : (g_{i_\mu} \in G_A) \wedge (\pi_1(g_{i_0}) = q_0) \wedge (\pi_4(g_{i_t}) \in Q_{fin}) \wedge (\pi_4(g_{i_\mu}) = \pi_1(g_{i_{\mu+1}}) \text{ for } \mu \ne t)\} \subseteq G_A^*$
be the regular set describing all the accepting paths in A's state diagram,
interpreted as finite automaton with input alphabet $G_A$.

Of course, $w \in R_A$ does not imply that $\pi_2(w)$ will be accepted by the
counter automaton A, since the final counter values may not be equal to zero.
Note that the number of reversals of the reversal-bounded counters is handled
by the finite control and can never be wrong.

On the basis of a (k, r)-counter automaton
$A := (Q, \Gamma_k, \delta_A, q_0, Q_{fin})$ two matrices $\Delta_A$ and
$\Delta_\Gamma$ are defined.

Definition 7. $\Delta_A \in \{+1, 0, -1\}^{k \times n_A}$ is defined for each
component, $1 \le i \le k$, $1 \le j \le n_A$ by:

$$\Delta_A(i, j) := \Delta_{g_j}(i).$$

Hence $\Delta_A$ can be written as composite matrix as follows:

$$\Delta_A = (\Delta_{g_1}\ \Delta_{g_2}\ \cdots\ \Delta_{g_{n_A}}).$$

With the notation from Definition 6 we see that
$\Delta_A \cdot \psi(v) = \pi_3(v)$ for each $v \in G_A^*$ and the following is
a consequence of the definition of acceptance for (k, r)-counter automata:

Lemma 3. Let $A := (Q, \Sigma, \delta_A, q_0, Q_{fin})$ be some (k, r)-counter
automaton. Then

$$\forall v \in R_A : \Delta_A \cdot \psi(v) = 0 \text{ iff } \pi_2(v) \in L(A).$$

Proof: $v \in R_A$ ensures that there exists a path in the state diagram of A
beginning in $q_0$ and ending in some final state of $Q_{fin}$. If in addition
$\pi_3(v) = \Delta_A \cdot \psi(v) = 0$, then $\pi_2(v) \in L(A)$. Conversely,
for any $w \in L(A)$ there exists an accepting path in A having a corresponding
string $v' \in R_A$ with $w = \pi_2(v')$. Since a (k, r)-counter automaton
accepts if the k counters are empty at the beginning and at the end, it follows
that $\pi_3(v') = \Delta_A \cdot \psi(v') = 0$. □
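
The identity $\Delta_A \cdot \psi(v) = \pi_3(v)$ used above can be checked numerically on a toy automaton. In this sketch an arc is a tuple (p, x, update, q), a path is a list of arc indices into a fixed ordering of the arcs, and all function names are hypothetical.

```python
def parikh(path, arcs):
    """Parikh vector of a path over the ordered arc set."""
    vec = [0] * len(arcs)
    for j in path:
        vec[j] += 1
    return vec

def delta_matrix(arcs, k):
    """k x n_A matrix whose j-th column is the counter update of arc g_j."""
    return [[arc[2][i] for arc in arcs] for i in range(k)]

def mat_vec(matrix, vec):
    """Plain matrix-vector product over the integers."""
    return [sum(a * x for a, x in zip(row, vec)) for row in matrix]

def pi3(path, arcs, k):
    """Componentwise sum of the counter updates along the path."""
    total = [0] * k
    for j in path:
        for i in range(k):
            total[i] += arcs[j][2][i]
    return total

# Two loops at state q: read a and add 1, read b and subtract 1.
arcs = [("q", "a", (1,), "q"), ("q", "b", (-1,), "q")]
path = [0, 1, 1, 0]                                           # spells "abba"
print(mat_vec(delta_matrix(arcs, 1), parikh(path, arcs)))    # [0]
print(pi3(path, arcs, 1))                                    # [0]
```

Only the number of times each arc is used matters for the final counter values, which is why the Parikh vector of a path suffices in Lemma 3.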

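Definition 8. For each (k, r)-counter automaton
$A := (Q, \Gamma_k, \delta_A, q_0, Q_{fin})$ the following matrix
$\Delta_\Gamma \in \{+1, 0, -1\}^{k \times n_A}$ is defined for each component
$\Delta_\Gamma(i, j)$, $1 \le i \le k$, $1 \le j \le n_A$, by:

$$\Delta_\Gamma(i, j) := \begin{cases} 1, & \text{if } \pi_2(g_j) = a_i \\ -1, & \text{if } \pi_2(g_j) = b_i \\ 0, & \text{if } \pi_2(g_j) \notin \{a_i, b_i\}. \end{cases}$$

Without loss of generality the ordering of the elements in $G_A$ is such that

$$\Delta_\Gamma = \begin{pmatrix}
1 \cdots 1 & -1 \cdots -1 & \cdots & 0 \cdots 0 & 0 \cdots 0 & 0 \cdots 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 \cdots 0 & 0 \cdots 0 & \cdots & 1 \cdots 1 & -1 \cdots -1 & 0 \cdots 0
\end{pmatrix} = (\gamma_1\ \gamma_2\ \cdots\ \gamma_{n_A}),$$

where $\gamma_j$ denotes the j-th column $\Delta_\Gamma(:, j)$ of $\Delta_\Gamma$.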







The next fact is obvious from the definitions and formulated without proof:

Lemma 4. Let $A := (Q, \Gamma_k, \delta_A, q_0, Q_{fin})$ be some
(k, r)-counter automaton. Then

$$\forall v \in G_A^* : \Delta_\Gamma \cdot \psi(v) = 0 \text{ iff } \pi_2(v) \in C_k.$$

We combine the preceding Lemmas (3 and 4) to get an equality which is
independent of the number of reversal-bounded counters but, through $R_A$,
not independent of the language accepted:

Lemma 5. Let $A := (Q, \Gamma_k, \delta_A, q_0, Q_{fin})$ be some
(k, r)-counter automaton accepting $L(A) \subseteq C_k$, and let
$\begin{pmatrix} \Delta_A \\ \Delta_\Gamma \end{pmatrix}$ denote the compound
matrix of dimension $2k \times n_A$. Then

$$\{v \in R_A \mid \Delta_A \cdot \psi(v) = 0\} = \Bigl\{v \in R_A \Bigm| \begin{pmatrix} \Delta_A \\ \Delta_\Gamma \end{pmatrix} \cdot \psi(v) = 0 \Bigr\}.$$



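Definition 9.

$$B_{k,r} := \{ a_1^{i_1} b_1^{i_1} \cdots a_r^{i_r} b_r^{i_r}\ a_{r+1}^{i_{r+1}} b_{r+1}^{i_{r+1}+j_{r+1}} a_{r+1}^{j_{r+1}} \cdots a_k^{i_k} b_k^{i_k+j_k} a_k^{j_k} \mid \forall \mu : i_\mu, j_\mu \in \mathbb{N} \}.$$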



Lemma 6. (k, r+1)-RBC $\subsetneq$ (k, r)-RBC for all $k \in \mathbb{N}$ and
$0 \le r < k$.

We will in fact prove $B_{k,r} \notin$ (k, r+1)-RBC, where the subset
$B_{k,r} \subsetneq L_{k,r}$ is defined above (Def. 9). We will see that the
equation of Lemma 5 cannot be satisfied if $B_{k,r}$ is accepted by using r+1
reversal-bounded counters. That this suffices is obvious, since $B_{k,r}$ is
obtained from $L_{k,r}$ by intersection with an appropriate bounded regular
set, hence $B_{k,r} \in$ (k, r)-RBC is easily seen.

The proof of Lemma 6 is quite involved and needs a lot of definitions first.
For the sake of contradiction, let us assume $B_{k,r} \in$ (k, r+1)-RBC and let
$A := (Q, \Gamma_k, \delta_A, q_0, Q_{fin})$ be a blind k-counter automaton
having r+1 reversal-bounded counters that accepts $B_{k,r} = L(A)$. Without
loss of generality, we assume that the first r+1 counters are reversal-bounded.

Definition 10. For each k € IN,k 0 and each l,l < I < k let C Ga x Ga 
be defined by: 

{ 3x e Fk: Tr 2 {g),TT 2 {g') & {x,X} , and 

Vl<j<L-Z\g(j)>0=+Z\,7(j)>0, 

{g, g') G means that the counter automaton A does not read two different 
symbols from the input by using g and g' , if any at all, and these arcs do 
not force a reversal on any of the counters with index less or equal to 1. The 
remaining counters with index strictly larger than I do not have any restriction 
on their updating. The relation is obviously symmetric and reflexive but not 
necessarily transitive. So we can only find subsets of C C Ga x Ga which are 
transitively closed. Any such set will be called a ^[-clique. 

Within the set Ra we identify a certain non-regular subset Ki to be used for 
the proof of Lemma 7 below. 

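Definition 11. The set

$$K_0 := \{w \in R_A \mid \exists i \in \mathbb{N} : \pi_2(w) = a_1^i b_1^i a_2^i b_2^i \cdots a_r^i b_r^i\ a_{r+1}^i b_{r+1}^{2i} a_{r+1}^i \cdots a_k^i b_k^{2i} a_k^i\}$$

is a non-regular subset of $R_A$ of which we select the set
$K_1 \subseteq K_0 \subsetneq R_A$ where no two different strings have an
identical $\pi_2$-projection:

$$K_1 := \{w \in K_0 \mid \forall w' \in K_0 : \pi_2(w) = \pi_2(w') \text{ implies } w = w'\}.$$

By $w_i$ we denote the unique string in $K_1$ for which

$$\pi_2(w_i) = a_1^i b_1^i a_2^i b_2^i \cdots a_r^i b_r^i\ a_{r+1}^i b_{r+1}^{2i} a_{r+1}^i \cdots a_k^i b_k^{2i} a_k^i.$$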

For a step-by-step definition of a specific non-regular subset of Ra which 
contains infinitely many strings from the set K\ we use the property Pf-r+i to 
specify certain strings within the set K\. 

Definition 12. p^r+i : G\ x IN ^ {true, false} is defined by: 

p. r+i {w, p) = true iff 3ui, . . . ,Up € G\: 

1. w = uiU2---Up and 

2. Vj, 1 < j < p : Vp,s' eGA--g,g' E Uj^{g,g') G 

3- Vj, 1 < j < p : 3 G Ga : g E Uj A g' E Uj+i A (g,g') ^ 

For each u G G*^ let G{u) := {g G Ga \ g E u}> where E denotes 

the substring relation. 

Here, G{uj) forms a (maximal) ^^^^-clique for each Uj of the decomposition 
w = uiU2 ■ ■ ■ Up. If two arcs g, g' G Ga are in the same ^(!~'’^-clique, then there 
exists X G -Tfc such that 7T2 (g) , 7T2 (g') G {A,x} and their TTs-projections do not 
lead to a reversal on one of the first r -I- 1 counters. The change between two 
“''^-cliques can thus be forced either by changing the symbols A) of the 
7T2-projections or by performing a reversal on one of the first r -|- 1 counters. 

For each Wi G Ki we have Ppr+i (wi,p) = true implies p < 3fc -I- 1. This is seen 

^k 

as follows: There exist at most 2r-|-3(fc — r) different ^-cliques with a compo- 
nent from Fk, since there are at most that many different blocks of consecutive 
identical symbols. Because (g,g') G also allows 7T2(g) = T^ 2 {g') = A, some of 
these arcs may fall into the neighboring ^^^^-clique, as long as these arcs do not 
force a reversal on one of the first $r+1$ counters. At most $r+1$ reversals may fall
into the $2r+3(k-r)$ different blocks, which allows for $r+1$ additional substrings
in the decomposition of $w_i = u_1 u_2 \cdots u_p$ and $p \le 2r + 3(k-r) + r + 1 = 3k + 1$.

Since fc is a constant and K\ is infinite, there exists some p < 3k + 1 such 
that infinitely many strings w G Ki satisfy p^r+i(w,p) = true. This gives rise to 
the subset K 2 E Ki defined next. 

Definition 13. Let p < 3k + 1 be fixed and such that K2 := {ic G ATi | 
p^r+i{w,p) = true} is infinite. Let ff{K2) := {i G iV | rui G K2} denote the 
index set for the strings in K2. 

Since $G_A$ is finite there exists a fixed string $w_g := g_{i,1} g_{i,2} \cdots g_{i,p} \in G_A^*$ where
$g_{i,j}$ is the leftmost symbol of $u_j$ for each $1 \le j \le p$ in the decomposition
$w = u_1 u_2 \cdots u_p$ for infinitely many strings $w \in K_2$. These strings are collected
in the set $K_3 \subseteq K_2$:

Definition 14. Let $w_g = g_{i,1} g_{i,2} \cdots g_{i,p} \in G_A^*$ be fixed and such that

$K_3 := K_2 \cap \{g_{i,1}\}G(u_1)^*\{g_{i,2}\}G(u_2)^* \cdots \{g_{i,p}\}G(u_p)^*$ is infinite. Moreover,
$\#(K_3) := \{i \in \mathbb{N} \mid w_i \in K_3\}$ denotes the index set for the strings in $K_3$.

The set $K_3 \subseteq K_2 \subseteq K_1$ is not regular but we shall find an infinite regular
set $L \subseteq \{g_{i,1}\}G(u_1)^*\{g_{i,2}\}G(u_2)^* \cdots \{g_{i,p}\}G(u_p)^*$ such that $K_3 \subseteq L \subseteq R_A$.

For each $j$, $1 \le j \le p$, let $L_j$ be the regular set accepted by the finite
automaton $A_j := (Q_j, G(u_j), \delta_j, \pi_1(g_{i,j}), Q_{j,fin})$, where

1. $Q_j := \{\pi_1(g), \pi_4(g) \mid g \in G(u_j)\}$,

2. $\delta_j : Q_j \times G(u_j) \to Q_j$ is given by $\delta_j(\pi_1(g), g) := \pi_4(g)$,

3. $Q_{j,fin} := \{\pi_4(g')\}$, where $g'$ is the rightmost symbol of $u_j$.
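
The construction of the automaton $A_j$ can be sketched in a few lines of Python. This is only an illustration under assumptions: arcs are modelled as 4-tuples (source state, input symbol, counter update, target state), so that the four fields play the roles of $\pi_1, \ldots, \pi_4$; the names `Arc`, `build_automaton` and `accepts` are hypothetical, and the single final state is taken to be the target of the rightmost arc.

```python
from typing import NamedTuple


class Arc(NamedTuple):
    source: str    # plays the role of pi_1(g)
    symbol: str    # plays the role of pi_2(g); "" stands for the empty word
    update: tuple  # plays the role of pi_3(g), one entry per counter
    target: str    # plays the role of pi_4(g)


def build_automaton(arcs, first_arc, last_arc):
    """Build a finite automaton whose input letters are the arcs themselves."""
    states = {g.source for g in arcs} | {g.target for g in arcs}
    # Reading arc g is only possible in its source state and leads to its target.
    delta = {(g.source, g): g.target for g in arcs}
    return states, delta, first_arc.source, {last_arc.target}


def accepts(automaton, word):
    """Check whether a sequence of arcs is accepted."""
    _, delta, state, final = automaton
    for g in word:
        if (state, g) not in delta:
            return False
        state = delta[(state, g)]
    return state in final


# Hypothetical arcs of a one-counter automaton.
g1 = Arc("p", "a", (1,), "q")
g2 = Arc("q", "a", (1,), "q")
automaton = build_automaton({g1, g2}, first_arc=g1, last_arc=g2)
```

Because every arc determines its own source and target, an accepted arc sequence is automatically a connected path in the state diagram of the original automaton.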

Since each accepting path in the automaton $A_j$ is a part of an accepting path
in $A$'s state diagram, we see that $K_3 \subseteq L \subseteq R_A$ for $L := L_1 L_2 \cdots L_p$. Moreover,
at least $3k - r$ languages among the $L_1, L_2, \ldots, L_p$ must be infinite, since the
projection of the elements of $K_3$ onto the elements of $\Gamma_k$ are infinite for each
of the $2r + 3(k-r)$ blocks of identical symbols. Since $L$ is regular, the Parikh-image
$\psi(L) = \sum_{j=1}^{p} \psi(L_j)$ is a semilinear set and infinite, too. The sum is
understood elementwise for the $p$ semilinear sets $\psi(L_j)$. Each linear subset of
$\psi(L_j)$ has a representation of the form:

$\{C_j + P_j Y \mid Y \in \mathbb{N}^{h_j}\}$ for some $h_j \ge 1$, $C_j \in \mathbb{N}^{n_A}$ and $P_j \in \mathbb{N}^{n_A \times h_j}$.

With these preliminaries we can formulate and prove the following important 
result: 

Lemma 7. There exists an infinite set K C Ra such that a) to c) hold: 

a) -f{K) = {G+PY I Y G for some heW, G e and P e 

b) If P{s,j) ■ P{t,j) yf 0 /or 1 < j < /i, 1 < s,t < then {gs,gt) G 

c; Vno G JV : 3 To G : (Vj : 1 < / < A To(j) > no) A C + PYq G /’(iFi). 

Proof: By definition of the finite automaton Aj the matrix Pj satisfies b) 
of Lemma 7. Given L := L 1 L 2 ■ ■ ■ Lp we choose for each Lj a linear subset 
{Gj + PjY I Y G INj} C ip{Lj) which should be infinite whenever Lj is infinite. 

The set S := {Gs + PsY \ Y G INg} defined by Cs '■= X) Qj bis ■= X bij, and 

i=i 

the compound matrix Ps := P 2 • • • Pp^ G is linear and infinite, 

too. The matrix Ps satisfies property b) of Lemma 7, since each submatrix 
Pj fulfilled this property. Now, K := L C\ is an infinite subset of L 

containing infinitely many elements from K 3 C Ki, thus satisfying properties a) 
(by ip{L') = S) and b) of Lemma 7. Of course we had to choose the appropriate 
linear subsets of each Lj to see that contains infinitely many elements 

of ATa Cf. L = L 1 L 2 ■ ■ ■ Lp. Now we modify the matrix Ps by omitting certain 
columns to obtain a matrix P that also satisfies c) of the Lemma. First, K n 
K 3 C Ra is infinite, so that there exists an infinite set M C such that 
'tp{K n Kf) = {G 5 + PsY I Y G M}. Let mo, mi, ... , m^, ... be any enumeration 
of the elements of M = {mi \ i G W}. Then there exists a subset M' = \ 

y j G IN : ij G IN A mij G M Aij < ij+i\ fk M such that for each j, I < j < hs'- 







either for all ti,i 2 G IN, 

or Trii^{j) < rrii^ij) *i < * 2 - 

This result is a variant of Dickson’s Lemma and can be proved easily. 

From M' we deduce the following index sets and constants: 

he ■■= {j \ l< j <hs,^l>l-. < TO*,+i(j)}, 

leg ■={j \ ^<j <hs,yi>l: and 

Cj := mi,{j) for each j G leg. 

Now, 

n X 3 ) = {Cs + PsY \ Y & M}^{Cs+ E Ps{-,j)cj + PY \Y & M"}, 

jelcg 

where P G is obtained from P 5 by omitting the columns Ps{'-,j) having 

index j G I eg, h := hs — \Ieq\, and M” is obtained from M' by omitting all 
components j, where j G leg- Thereby, M” ^ is a set which can be linearly 
ordered by < and this relation applies to all components of its elements. Thus, 
also property c) of Lemma 7 is satisfied, and the proof is finished. □



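Lemma 8. $\operatorname{rank}\left(\begin{pmatrix}\Delta_A \\ \Delta_\Gamma\end{pmatrix} \cdot P\right) > \operatorname{rank}(\Delta_\Gamma \cdot P)$.

Proof: Let $\begin{pmatrix} Y_1 & Y_2 & \cdots & Y_h \\ Z_1 & Z_2 & \cdots & Z_h \end{pmatrix} := \begin{pmatrix}\Delta_A \\ \Delta_\Gamma\end{pmatrix} \cdot P$, where for each $1 \le j \le h$, the
columns $Y_j := (\Delta_A \cdot P)(:,j)$ of $\Delta_A \cdot P$ are given by
$Y_j(l) := \sum_{i=1}^{n_A} \Delta_A(l,i) P(i,j) := \sum_{i=1}^{n_A} \Delta_{g_i}(l) \cdot P(i,j)$ for $1 \le l \le k$, and likewise $Z_j := (\Delta_\Gamma \cdot P)(:,j)$ denotes the
$j$-th column of $\Delta_\Gamma \cdot P$ with $Z_j(l) := \sum_{i=1}^{n_A} \Delta_\Gamma(l,i) P(i,j)$. From b) in Lemma 7 one
concludes that each column $Z_j$, $1 \le j \le h$, has at most one non-zero component:
if $Z_j(l) \neq 0$ then $Z_j(i) = 0$ for each $i \neq l$.

We still have to verify: $\operatorname{rank}\begin{pmatrix} Y_1 & Y_2 & \cdots & Y_h \\ Z_1 & Z_2 & \cdots & Z_h \end{pmatrix} > \operatorname{rank}\,( Z_1 \; Z_2 \; \cdots \; Z_h )$.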

By the definition of matrix $\Delta_\Gamma$ (Def. 8) and the construction of $P$ (Lemma 7)
one readily verifies that the rows of the compound matrix $( Z_1 \; Z_2 \; \cdots \; Z_h )$
are linearly independent. For later use let $\alpha_1, \alpha_2, \ldots, \alpha_k$ and $\beta_1, \beta_2, \ldots, \beta_k$ denote
the rows of $( Y_1 \; Y_2 \; \cdots \; Y_h )$, respectively those of $( Z_1 \; Z_2 \; \cdots \; Z_h )$.

Each word Wi G K H K 3 , i G #{Kz) can be written as 

(i) (i) (i) (i) (i) (i) (i) (i) (i) (i) (i) (i) i 

Wi = • • • K'lKhKiisK+ia'^riiA • • • where 



1. tt 2 {u's^i) = a\ and 7r2(Us*2) = ®^ch s, 1 < s < r, 

2. 7T2(r(;^*i) = 7r2(rCg*3) = a\ and 7r2(rCg*2) = ®^ch s, r -|- 1 < s < A. 

If $P(i,j) \neq 0$ then for each $n_0 \in \mathbb{N}$ there exists $w \in K$ such that $|w|_{g_i} > n_0$.
This follows from c) in Lemma 7. Let $G_1 := \{g \in G_A \mid \forall n_0 \in \mathbb{N} : \exists w \in K : |w|_g > n_0\}$
be the set of all these arcs. Now we want to show that the
row-space $\{\alpha_1, \alpha_2, \ldots, \alpha_k, \beta_1, \beta_2, \ldots, \beta_k\}$ contains strictly more linearly independent
elements than the row-space $\{\beta_1, \beta_2, \ldots, \beta_k\}$ if $B_{k,r} = L(A)$ for the counter
automaton $A$ having $r+1$ reversal bounded counters. We distinguish two cases:

1 . Assume that one of the reversal bounded counters will be changed by an 
arc g G n G\ for some I, r + 1 < I < k. W.l.o.g. we assume that 

this is the first counter, hence Ag{l) = 7r3(^)(l) yf 0 . By choosing two more arcs 

from G{wi^lwi^l'Wi''l) n Gi we can always find three elements i 5 ms ^ 

1 < < nA, such that: 



1- 5 G 

2 . E 5^2 E w\"l, and 5^3 E w\"l, 

3. if g yf Ofij for some 1 < j < 3 then 7T2((7^3-) yf 0. 



Now consider two triples y := (yi,j/2,2/3) and 2; := (21,2:2, ^3), where j/i, j/2, 
and 7/3 are entries of the matrix {Y\ Y2 • • • Yh) and 21, 22, and 23 are entries 
of the matrix {Zi Z2 ■ ■ ■ Zh). y and 2 are specified as follows: for 1 < j < 3 
the elements yj are located in the first row «i of ( Yi Y2 • • • Yh) and the ele- 
ments Zj are located in the Eth row ( 3 i oi {Z\ Z2 ■ ■ ■ Zh) and their crossing 

with some column | j , where 1 < rrij < h for 1 < j < 3 , and rrij is such, that 

P{fj,j,nij) yf 0 . As mentioned before, each column of ( Z2 ■ ■ ■ Zh) has at 
most a single entry not equal to zero. Since 7^2(9 hi) ^ '^2(9 11.2 ) G {op E, A} 
these entries must occur in the /-th row j 3 i of ( Zmi Z^^ ), hence applies 

to the elements 21, 22, and 23. Consequently, if y was linearly independent of 2 
then also the first row a\ would be linearly independent of the rows Pi, P2, ■ ■ •, 
Pk- This would imply the statement of Lemma 8. Thus it suffices to prove that 
indeed y and 2 are linearly independent. 

Among the cases g = g^,^, g = (/^2, or g = 3^3 we select g := g^,^ as subcase 
1.1 (the remaining cases are similar): 

By the choice of g we have Ag(l) yf 0 which implies yi yf 0 . Now, either 2 = 
( 0 ,-l,l) or 2 = ( 1 ,- 1 , 1 ) by definition of , 5^2 , Since 2 = ( 0 ,-l,l) 
means independence of {y, 2} we proceed by assuming 2 = ( 1 , — 1 , 1 ). Since the 
first counter is reversal bounded, only the following choices are possible for y: 
2/1 > 0,7/2 > 0,1/3 yf 0 , 1/1 > 0,1/2 < 0,7/3 < 0 , or yi < 0,i/2 < 0,7/3 < 0 . It is 
immediately verified that in all these cases y is linearly independent of 2. 

2 . We next have to consider the case, that for each l,r + 1 < I < k no arc 
g G G{w\''lwl''lwl^l) n Gi updates one of the r-|-l reversal bounded counters. Let 
G2 := G( 7 c^!|;^ 2^r+i.3 ' ’ ' 'A G\ be the relevant set of these 



arcs. Again we consider matrices defined from columns of 




IE 

^2 



Yh 

Zh 



as follows: 



Let ( Yjyi^ Yjji2 * * * Yjyi^ ) and ( Zjyi^ ^m2 ' ' ' Zjn ^ ) be the matrices 

consisting of those columns of ( Yi ••• Yli ), respectively of ( ••• Zh),ior 

which P{j, rrii) yf 0 and gj G G2 where 1 < j < ua, and 1 < rrii < h for all 1 < 
i < q. Now, g G G2 implies Ag{j) = 0 for 1 < j < r+ 1 , since none of the reversal 
bounded counters is modified by an arc from the set G2. Consequently (j) =0 
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for 1 < j < r + 1 and each I < i < q, so that rank ( Ymi • • • Y ^^ ) < 

k — (r + 1). On the other hand, rank ( Zmi Zm 2 ' ' ' ) = k — r, since 

rank {Zi Z^ • ■ ■ Zh) = k and r rows of ( Zmi Z^^ ' ' ' ) have an 

entry equal to zero (recall T^ 2 {g) ^ -O- for g G G 2 ). Also in case 2. the statement 
of the lemma has been proved. □

Let A G be of rank r, B G and L := {x £ IN^ \ Ax = B} be the 

set of all non-negative solutions of the linear equation Ax = B. It is known from 
Linear Algebra that each subset M C L of linearly independent elements has 
cardinality of at most (h — r). 

Lemma 9. If L := {x £ IN^ \ Ax = B} for some A £ gf rank r and 

B £ . If for each n £ IN there exists x £ L such that x{i) > n for each i, 1 < 

i < h, then L contains a subset M = {x\,X 2 , ■ ■ ■ ,Xh-r} of linearly independent 
elements. 



Proof: Let $y_1, y_2, \ldots, y_{h-r} \in \mathbb{Z}^h$ be linearly independent solutions of the
homogeneous linear equation $Ax = 0$ and define $n_0 := \max\{ |y_j(i)| \mid 1 \le i \le h, 1 \le j \le h - r \}$.
Now, if $x_0 \in L$ is a solution of the inhomogeneous linear
equation $Ax = B$ that satisfies $x_0(i) > n_0$ for each $i$, then $x_0 + y_1, x_0 + y_2, \ldots, x_0 + y_{h-r}$ are
linearly independent and non-negative solutions of the equation $Ax = B$. □
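
The shifting argument of this proof can be checked on one small instance. The following sketch is only an illustration, not part of the proof: the matrix `A`, the vector `B`, the homogeneous solutions `ys` and the particular solution `x0` are hand-picked assumptions, and `rank` is a hypothetical helper doing Gaussian elimination over exact fractions.

```python
from fractions import Fraction


def rank(rows):
    """Rank of an integer or rational matrix via Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


A = [[1, -1, 0]]               # h = 3 columns, rank r = 1
B = [0]
ys = [(1, 1, 0), (0, 0, 1)]    # h - r = 2 independent solutions of A x = 0
n0 = max(abs(c) for y in ys for c in y)
x0 = (2, 2, 2)                 # solution of A x = B with every entry > n0
shifted = [tuple(a + b for a, b in zip(x0, y)) for y in ys]
```

Here `shifted` consists of the two vectors (3, 3, 2) and (2, 2, 3); both solve the equation, both are non-negative, and together they have rank 2.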



Proof of Lemma 6: From Lemma 5 we see that L{A) = Bk^r C Ck implies 

{v £ Ra I Aa ■ tp(v) = 0} = {w G Ra I ^ ^ • 'f’(v) = 0}. Using K C Ra from 

Lemma 7 a) with f/'(Ar) = {C + P-Y \ Y £ IN^} and by b) there exists Yq £ IN^ 
for each no £ IN with Yo{j) > no for each I < j < h and a string w £ Ra such 
that 7T2 (w) £ Bk^r, Ar ■ ip{w) = 0, Aa ■ 'f’iw) = 0, and C + P -Yo = fj{w). This 
yields the equation 



(*): 



{Y £IN’^\Aa-P-Y = -{Aa) ■ C) = 



|y G iN’^ 



(Aa 
\ Ar 



■ P-Y = - 



(Aa 
\ Ar 




and 



Yq £ {y G IN^ I Aa ■ P Y = —{Aa) • C} • But by Lemma 9 and Lemma 8 we 
see 



rank {Y £ \ Aa ■ P ■ Y = ~{Aa) ■ C] > 



rank < F G IN 



Aa 

Ar 



P-Y = - 



Aa 

Ar 



which means that equation (*) cannot be fulfilled and $B_{k,r} \notin (k, r+1)$-RBC. □



The above results yield our main Theorem: 

Theorem 1. $(k_1,r_1)$-RBC $\subsetneq (k_2,r_2)$-RBC iff ($k_1 < k_2$) or ($k_1 = k_2$ and
$r_1 > r_2$).

Proof: The mere inclusion $(k_1,r_1)$-RBC $\subseteq (k_2,r_2)$-RBC if ($k_1 < k_2$) or
($k_1 = k_2$ and $r_1 > r_2$) follows from the definition of the family $(k,r)$-RBC
(Definitions 2 to 4). The strictness of $(k_1,r_1)$-RBC $\subsetneq (k_2,r_2)$-RBC if $k_1 < k_2$
is verified as follows:

By definition we have $(k,r_1)$-RBC $\subseteq (k,0)$-RBC for any $r_1 \le k$, the strict
inclusion $(k,0)$-RBC $= \mathcal{M}(C_k) \subsetneq \mathcal{M}(B_{k+1}) = (k+1,k+1)$-RBC is Theorem 2,
and again by definition $(k+1,k+1)$-RBC $\subseteq (k+1,r_2)$-RBC for any $r_2 \le k+1$.
Finally, the inclusion $(k,r+1)$-RBC $\subsetneq (k,r)$-RBC for all $k \in \mathbb{N}$ and $0 \le r < k$
has been shown in Lemma 6. □
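
Read as a decision rule, the condition of Theorem 1 is a lexicographic comparison. A minimal sketch, assuming that a family is represented by its pair of parameters with $0 \le r \le k$ (the function name `strictly_included` is hypothetical):

```python
def strictly_included(k1, r1, k2, r2):
    """Condition of Theorem 1 for a strict inclusion between (k1,r1)-RBC and (k2,r2)-RBC."""
    # Assumption: parameters satisfy 0 <= r <= k, as in the families considered.
    assert 0 <= r1 <= k1 and 0 <= r2 <= k2
    return k1 < k2 or (k1 == k2 and r1 > r2)
```

For example, `strictly_included(2, 1, 2, 0)` holds, while `strictly_included(2, 0, 2, 1)` does not.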

Abstract. As evidence of the power of finite unary substitutions we
show that the inclusion problem for finite substitutions on the language
$L = ab^*c$ is undecidable, i.e. it is undecidable whether for two finite
substitutions $\varphi$ and $\psi$ the relation $\varphi(w) \subseteq \psi(w)$ holds for all $w$ in $L$.



1 Introduction 

Finite substitutions between free monoids are natural extensions of correspond- 
ing morphisms. However, due to their inherent nondeterministic nature, they 
behave in many aspects very differently. A goal of this paper is to emphasize 
this difference in a particularly simple setting. 

Finite substitutions, as well as their images, i.e. finite languages, have been 
studied rather intensively during the last few years. Such research has revealed a 
number of nice, and also surprising, results. In [LII], see also [HH], it was shown 
that the question whether two finite substitutions are equivalent, word by word, 
on the language L = a{b,c}*d is undecidable, in other words, the equivalence 
problem for finite substitutions on the language L, and hence also on regular 
languages, is undecidable. In [CKO] all finite languages commuting with a given
two-element language were characterized, and as a byproduct Conway’s Problem
for two-element sets was solved affirmatively. Conway’s Problem, see [C], asks
whether the maximal set commuting with a given rational X, referred to as its 
centralizer, is rational as well. Very recently Conway’s Problem for three-element 
sets was also solved in [KP], but the problem remains open even for finite sets 
X. The general problem, as well as some related ones, seems to be very hard. 

An intriguing subcase of the problem solved in [LII] is the case when $L'$ is
assumed to be $ab^*c$, i.e. a very special bounded language. This problem was
posed, at least implicitly, in [CuK] and has, so far, resisted all attempts at a
solution. In [KL] some special cases, as well as related problems, were considered.

One result of [KL] shows that the inclusion problem for finite substitutions
on regular languages is decidable if the substitutions are (or in fact only the
simulating one is) so-called prefix substitutions, that is, the images of the letters are
prefix sets. Here we show that the restriction to prefix substitutions is essential.
Indeed, otherwise the problem becomes undecidable even in the case when the
language equals $L' = ab^*c$. The corresponding equivalence problem still
remains open.

This paper is organized as follows. 

First in Section 2 we fix the needed terminology, and recall the basic tool 
used here, the notion of a nondeterministic defense system. Section 3 is devoted 
to our main undecidability result. In Section 4 we consider applications of our 
result, as well as some related ones. 

In this extended abstract the proof of the main result is only partially pre- 
sented, and some other proofs are omitted. 

2 Preliminaries 

In this section we fix our terminology, introduce our problems and recall the 
basic tools needed. For undefined notions in combinatorics of words we refer to 
[ChK] and in automata theory to [B] . 

Let $\Sigma$ be a finite alphabet, and $\Sigma^*$ (resp. $\Sigma^+$) the free monoid (resp. semigroup)
generated by $\Sigma$. We denote by 1 the unit of $\Sigma^*$, so that $\Sigma^* = \Sigma^+ \cup \{1\}$.
For two finite alphabets $\Sigma$ and $\Delta$ we consider finite substitutions $\varphi : \Sigma^* \to \Delta^*$
which are many-valued mappings and can be defined as morphisms from $\Sigma^*$ into
the monoid of finite subsets of $\Delta^*$, i.e. into $2^{\Delta^*}$. If $\varphi$ is single-valued it is an
ordinary semigroup morphism $\Sigma^* \to \Delta^*$. By a 1-free (or $\varepsilon$-free) finite substitution
we mean a finite substitution $\varphi$ for which 1 is not in $\varphi(a)$ for any $a$ in $\Sigma$.

Let $\varphi, \psi$ be finite substitutions $\Sigma^* \to \Delta^*$ and $L \subseteq \Sigma^*$ a language. We say
that $\varphi$ and $\psi$ are equivalent on $L$ if and only if

$\varphi(w) = \psi(w)$ for all $w \in L$.

Similarly, we say that $\varphi$ is included in $\psi$ on $L$ if and only if

$\varphi(w) \subseteq \psi(w)$ for all $w \in L$.

We note that the question $\varphi(w) \subseteq \psi(w)$ (for a fixed $w$) can be viewed as the task
of finding a winning strategy in a two-player game: for any choice of values in
$\varphi(a)$, the substitution $\psi$ must be able to respond following the input word.
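
For a fixed word, or for finitely many words, the inclusion can be tested by brute force. The sketch below is only an illustration under assumptions: a finite substitution is represented as a dictionary from letters to finite sets of strings, the two example substitutions `phi` and `psi` are made up, and the helper names `image` and `included_on` are hypothetical. Such a test says nothing about an infinite language as a whole.

```python
def image(subst, word):
    """Image of a word under a finite substitution (dict: letter -> set of strings)."""
    result = {""}
    for letter in word:
        result = {u + v for u in result for v in subst[letter]}
    return result


def included_on(phi, psi, words):
    """Check phi(w) <= psi(w) for each of the finitely many given words."""
    return all(image(phi, w) <= image(psi, w) for w in words)


# Made-up substitutions from {a, b, c}* into {0, 1}*.
phi = {"a": {"0"}, "b": {"1", "11"}, "c": {"0"}}
psi = {"a": {"0", "01"}, "b": {"1"}, "c": {"0", "10"}}
```

With these choices the inclusion holds on `ac`, `abc` and `abbc`, but fails on `abbbc`: the word `01111110` lies in the image under `phi` and not in the image under `psi`. A test on short words can therefore be misleading about the whole language.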

Now we can state two important decision problems. 

Problem 1 ($P_1$). Given two finite substitutions $\varphi, \psi : \Sigma^* \to \Delta^*$ and a rational
language $L \subseteq \Sigma^*$, decide whether or not $\varphi$ and $\psi$ are equivalent on $L$.

Problem 2 ($P_2$). Given two finite substitutions $\varphi, \psi : \Sigma^* \to \Delta^*$ and a rational
language $L \subseteq \Sigma^*$, decide whether or not $\varphi$ is included in $\psi$ on $L$.

There are two obvious remarks. First, using any standard encoding we can
assume that $\Delta$ is binary, say $\Delta = \{0, 1\}$. Second, in the special case of morphisms
the problems are equal, and easily seen to be decidable.

For finite substitutions the situation changes drastically. Indeed, even in the 
case when the language L is chosen to be fixed, the problem seems to be very 
difficult. From the point of view of this paper interesting subcases are obtained 
when $L$ is fixed to be $ab^*c$, a very special bounded language. In this case we
restate the problems as follows: 

Problem 3 ($UP_1$). Problem $P_1$ for the fixed language $L = ab^*c$.

Problem 4 ($UP_2$). Problem $P_2$ for the fixed language $L = ab^*c$.

We use U above as an indication that the problems deal with finite substi- 
tutions which are essentially over a unary input alphabet. More precisely, we 
consider problems on unary finite substitutions augmented with endmarkers. 

Problem $P_1$, and hence also $P_2$, was shown to be undecidable in [LII] even in
the case when $L$ is fixed to be the language $a\{b, c\}^*d$. Actually the undecidability
of Problem $P_2$ is very easy to conclude. On the other hand, these problems are
decidable not only in the case when the mappings are morphisms, but also in the
case where they are so-called prefix substitutions, i.e. the images of the letters
are prefix sets, cf. [KL]. In fact, for the decidability it is enough that $\psi$ is a
prefix substitution. Several related problems are considered in [M], [TI], [Fill]
and [Til].

So the interesting remaining problems are $UP_1$ and $UP_2$. We are not able to
solve $UP_1$ here, but we do solve $UP_2$. And surprisingly the answer is negative:
the problem is undecidable.

The basic tool in our proof is to use so-called nondeterministic defense systems.
A nondeterministic defense system, ND-system in short, over the alphabet
$\Delta$ is a triple $V = (Q, P, q_1)$, where $Q$ is a finite set of states, $q_1$ is the unique
initial (or principal) state and $P$ is a finite set of rules of the form $(p, a, q, z)$,
where $p, q \in Q$, $a \in \Delta$ and $z \in \{-1, 0, 1\}$, that is $P \subseteq Q \times \Delta \times Q \times \{-1, 0, 1\}$.
We say that the ND-system $V$ is reliable if and only if, for each $w = a_1 \ldots a_t$,
with $a_i \in \Delta$ for $i = 1, \ldots, t$, there exist states $q_1, \ldots, q_{t+1}$ such that

$(q_i, a_i, q_{i+1}, z_i) \in P$ for $i = 1, \ldots, t$, (1)

and moreover,

$\sum_{i=1}^{t} z_i = 0$. (2)

We emphasize that the sequence (1) can be interpreted, in a natural way, as a
computation in a finite transducer: $w$ corresponds to the input, the $q_i$'s determine the
state transitions and the numbers $z_i$ are the outputs produced in each step. The
essential condition is the condition (2) which requires that the sum of the outputs
equals zero. Such computations are called defending. Hence the reliability means
that for each input word there exists a defending computation. Now, a crucial
result is the following

Theorem 1. It is undecidable whether a given ND-system is reliable.

The proof of Theorem 1 can be found in [LI]. It uses the undecidability of the
Post Correspondence Problem. Actually the original ND-systems were equipped
with probabilities, but those are not needed in the above proof. It is also obvious
that $\Delta$ can be fixed as long as it contains at least two symbols. We fix $\Delta = \{0, 1\}$.
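
Although reliability itself is undecidable, the existence of a defending computation for one given input word is easy to test by tracking all reachable pairs (state, running sum of outputs). The sketch below assumes rules are given as 4-tuples $(p, a, q, z)$ as in the definition above; the function names and the example system are hypothetical, and the bounded search can only ever refute reliability, never establish it.

```python
from itertools import product


def has_defending_computation(rules, initial, word):
    """Is there a computation on `word` from `initial` whose outputs sum to zero?"""
    configs = {(initial, 0)}  # reachable (state, running sum) pairs
    for letter in word:
        configs = {(q, total + z)
                   for (state, total) in configs
                   for (p, a, q, z) in rules
                   if p == state and a == letter}
    return any(total == 0 for _, total in configs)


def unreliability_witness(rules, initial, alphabet, max_len):
    """Return some word of length <= max_len without a defending computation, or None."""
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            word = "".join(letters)
            if not has_defending_computation(rules, initial, word):
                return word
    return None


# Made-up one-state system over {0, 1}: reading 0 outputs 0, reading 1 outputs +1 or -1.
rules = {(1, "0", 1, 0), (1, "1", 1, 1), (1, "1", 1, -1)}
```

In this made-up system the word `11` has a defending computation (outputs +1 and -1), whereas the word `1` has none, so the system is not reliable.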



3 The Main Result 

This section is devoted to the main result of this paper and to its proof. 

Theorem 2. The inclusion problem for 1-free finite substitutions on the
language $L = ab^*c$ is undecidable.

Proof. We reduce the reliability problem for ND-systems to our inclusion problem.
Let $V = (Q, P, q_1)$ be an ND-system over $\{0, 1\}$ and with $Q = \{q_1, \ldots, q_s\}$. We
associate $V$ with a pair $(\varphi, \psi)$ of finite substitutions

$\{a, b, c\}^* \to \{0, 1, 2, 3, 4, 5, 6\}^*$

such that

$V$ is reliable (3)

if and only if

$\varphi(ab^ic) \subseteq \psi(ab^ic)$ for all $i \ge 0$. (4)

Hence, by Theorem 1, the result would follow.

Before defining $\varphi$ and $\psi$ we have to fix some terminology. We define

$W = v_1 \cdots v_{s+1}$ with $v_i = 0^i1234$ for $i = 1, \ldots, s+1$. (5)

Consequently, $W \in \{0, 1, 2, 3, 4\}^+$. Further we set $w_{k,j} = v_k \cdots v_j$ for $1 \le k \le j \le s+1$.
Next, for $k, j \in \{1, \ldots, s\}$, $a \in \{0, 1\}$ and $y \in \{-1, 0, 2\}$ we define
words

$F(a,k,j,y) = w_{k+1,s+1}(S(a)S(a)W)^{y+1}S(a)S(a)w_{1,j}$,

and

$B_a = S(a)S(a)WS(a)S(a)W$,

where $S(a) = 5 + a$.

Now, using the word $F$ we define three new sets of words:

(i) $I_1(a,k,j,-1) = F(a,k,j,2)$

(ii) $I_2(a,k,j,0) = F(a,k,j,0)(34)^{-1}$,
     $T_2(a,j,j,0) = 34F(a,j,j,0)$

(iii) $I_3(a,k,j,1) = F(a,k,j,-1)(234)^{-1}$,
      $M_3(a,j,j,1) = 234F(a,j,j,0)4^{-1}$,
      $T_3(a,j,j,1) = 4F(a,j,j,-1)$

Here we use the notation $uv^{-1}$ for the right quotient of $u$ by $v$. Note also that the
fourth argument of these words inside any group (i)-(iii) is always a constant,
either $-1$, 0 or 1. The abbreviations $I$, $T$ and $M$ come from the words initial,
terminal and middle, respectively. From now on we may talk about $I$- or $T_2$-words,
for example.
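
For single words, the right quotient simply strips a suffix. A minimal sketch, assuming words are plain Python strings (the function name `right_quotient` is hypothetical):

```python
def right_quotient(u, v):
    """Return the word x with x + v == u; raise ValueError if v is not a suffix of u."""
    if not u.endswith(v):
        raise ValueError(f"{v!r} is not a suffix of {u!r}")
    return u[:len(u) - len(v)]
```

The removed suffix can be restored by a following word that starts with it: `right_quotient("01234", "34") + "34"` gives back `"01234"`, so consecutive words can share a boundary that is cut at different positions.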

Next, out of the set of all above $I$-, $T$- and $M$-words we select some, based
on the rules of the defense system $V$, to constitute a language $L$. It consists of
exactly the following words:

$I_1(a,k,j,-1)$, if $(k,a,j,-1) \in P$,
$I_2(a,k,j,0)$ and $T_2(a,j,j,0)$, if $(k,a,j,0) \in P$, (6)
$I_3(a,k,j,1)$, $M_3(a,j,j,1)$ and $T_3(a,j,j,1)$, if $(k,a,j,1) \in P$.

Here, of course, $a$, $k$ and $j$ range over the sets $\{0,1\}$, $\{1, \ldots, s\}$ and $\{1, \ldots, s\}$,
respectively.

Now, we are ready to define the required finite substitutions. The substitution
$\varphi$ is defined by

$\varphi(a) = \varphi(c) = W$ and $\varphi(b) = \{B_0B_0, B_1B_1\}$.

Consequently, for each $n \ge 0$, we have

$\varphi(ab^nc) = W\{(55W)^4, (66W)^4\}^nW$,

where $W$ is defined in (5).

The substitution $\psi$, in turn, is defined by the formulas

$\psi(a) = w_{1,1}$,
$\psi(b) = LL$,
$\psi(c) = \{\gamma \mid \gamma = w_{k,s+1}W \text{ with } 2 \le k \le s+1\}$.
It remains to be proved that the construction works as intended, that is: the 
conditions (3) and (4) are equivalent. In this extended abstract we prove the 
implication in only one direction. 

Assume that $V$ is reliable. We have to show that, for each $n \ge 0$ and each
word $z$ of the form

$z = u_0u_1 \ldots u_{n+1}$ with $u_0 = u_{n+1} = W$ and $u_i \in \{B_0B_0, B_1B_1\}$
for $i = 1, \ldots, n$

there exist words $v_1, \ldots, v_n \in \psi(b)$ and $v_{n+1} \in \psi(c)$ such that

$w_{1,1}v_1 \ldots v_nv_{n+1} = z$.

In the case $n = 0$ we can choose $v_1 = w_{2,s+1}W$ so that $w_{1,1}v_1 = WW$ as required.

Assume now that $n \ge 1$. Next we use the assumption that $V$ is reliable.
Consider the word $t = a_1 \ldots a_n \in \{0, 1\}^n$ defined by

$a_i = a$ if and only if $u_i = B_aB_a$, for $i = 1, \ldots, n$.

Since $V$ is reliable, there exist states $q_1 = q_{j_1}, \ldots, q_{j_{n+1}}$ and numbers $z_1, \ldots, z_n \in \{-1, 0, 1\}$ such that

$(q_{j_i}, a_i, q_{j_{i+1}}, z_i) \in P$, for $i = 1, \ldots, n$, (7)

and moreover,

$\sum_{i=1}^{n} z_i = 0$. (8)

The numbers $z_i$ in (7) define, for $i = 1, \ldots, n$, via (6) the words of the types $I_1$,
$I_2T_2$ or $I_3M_3T_3$ depending on whether $z_i = -1$, 0, or 1, respectively. Moreover,
such a word, say $y_i$, is of the form

$y_i = w_{j_i+1,s+1}(S(a_i)S(a_i)W)^3S(a_i)S(a_i)w_{1,j_{i+1}}$.

Consequently, by choosing $v_{n+1} = w_{j_{n+1}+1,s+1}W$ we conclude that

$w_{1,1}y_1y_2 \ldots y_nv_{n+1} = z = u_0u_1 \ldots u_nu_{n+1}$.

Now the crucial observation is that the word $y_1 \ldots y_n$ consists of altogether
$2n + \sum_{i=1}^{n} z_i$ factors of types $I$, $M$, and $T$, that is of $L$. Hence, by (8), this word
can be refactorized as

$y_1 \ldots y_n = v_1 \ldots v_n$ with $v_i \in LL = \psi(b)$ for $i = 1, \ldots, n$.

Therefore the factorization $z = w_{1,1}v_1 \ldots v_{n+1}$ is the required one.

The detailed proof of the other implication can be found in the final version 
of this paper. 



4 Applications and Related Problems 

In this section we look at some applications of Theorem 2, as well as its
strengthenings. We first show that one of the endmarkers, say $c$, can be
completely eliminated in the formulation of Theorem 2, and that even both can be
essentially eliminated. Both these results are obtained straightforwardly from
the constructions of the proof of Theorem 2.

Theorem 3. It is undecidable whether for two finite 1-free substitutions one is
included in the other on the language $ab^+$.




394 



Juhani Karhumaki and Leonid P. Lisovik 



Of course, the language L can not be further reduced to 

Theorem 4. It is undecidable whether for two words $\alpha$ and $\beta$ and two finite
sets $C$ and $D$ the following holds true:

{a, fir C for all n > 1. 

Next we state a few applications of our main result. Recall that a finite 
substitution r : S* —>■ A* can be realized by a nondeterministic generalized 
sequential machine (ngsm for short) without any states, and that a finite sub- 
stitution T : {a, b}* — > A* restricted to the language can be realized by a 
two-state ngsm with a unary input alphabet. Indeed, the outputs associated to a 
can be associated to the reading of b combined with a change of the state (Hence 
the inputs are changed from aV into Consequently, Theorem 3 now yields 

Theorem 5. The inclusion (resp. the equivalence) problem for relations defined 
by two-state (resp. three-state) ngsm’s with a unary input alphabet is undecidable. 

In fact, in the inclusion problem of Theorem 5 one of the relations (namely 
the one which is asked to be included into the other) can be taken to be a finite 
substitution (on a unary alphabet). Therefore, the statement for the equivalence 
problem follows by considering the two-state ngsm and the union of it and the 
one-state ngsm. Hence, the equivalence remains undecidable even if one of the 
ngsm’s has only two states. We also recall that Theorem 5 and its proof
techniques are essential strengthenings of those used in [LHI], where the simulating
transducer is required to be only a finite transducer.

The other corollary comes from the fact that the language $L = ab^+$ is a
D0L language, cf. [RS]. We call a D0L language binary, if it is over a two-letter
alphabet.

Theorem 6. The inclusion problem of finite substitutions on binary D0L
languages is undecidable.

As a contrast to the above theorem we recall that the equivalence of
morphisms on D0L languages is decidable, cf. e.g. [CuK]. However, even in the case
of binary D0L languages the problem is not trivial, although computationally
easy: it is enough to consider the first four words of the language, cf. [K].

We conclude with a few remarks on our problem $P_2$, which asks for two finite
substitutions $\varphi$ and $\psi$ and a rational language $L$ whether or not $\varphi(w) \subseteq \psi(w)$
for all $w \in L$. Now, if $\psi$ is a morphism then so must be $\varphi$ (or the inclusion does
not hold), and the problem is trivially decidable. If, in turn, $\varphi$ is a morphism
we are in a nontrivial case: In general, the problem is undecidable, cf. [M], [TH]
and [LHI], while if the language $L$ is assumed to be of the form $ab^*c$, or more
generally bounded, then the problem becomes decidable, as will be shown in a
forthcoming note.

Abstract. We resolve several long-standing open questions regarding
the power of various types of finite-state automata to recognize “picture
languages,” i.e. sets of two-dimensional arrays of symbols. We show
that the languages recognized by 4-way alternating finite-state automata
(AFAs) are incomparable to the so-called tiling recognizable languages.
Specifically, we show that the set of acyclic directed grid graphs with
crossover is AFA-recognizable but not tiling recognizable, while its com- 
plement is tiling recognizable but not AFA-recognizable. Since we also 
show that the complement of an AFA-recognizable language is tiling rec- 
ognizable, it follows that the AFA-recognizable languages are not closed 
under complementation. In addition, we show that the set of languages 
recognized by 4-way NFAs is not closed under complementation, and 
that NFAs are more powerful than DFAs, even for languages over one 
symbol. 



1 Introduction 

Two-dimensional words, or “pictures,” are rectangular arrays of symbols over 
a finite alphabet, and sets of such pictures are “picture languages.” Pictures 
can be accepted or rejected by various types of automata, and this gives rise to 
different language classes; thus picture languages form an interesting extension 
of the classical theory of one-dimensional languages and automata, and can be 
viewed as formal models of image recognition. 

In particular, let us consider finite-state automata, which recognize two- 
dimensional generalizations of the regular languages. In one dimension, we can 
define the regular languages as those recognized by finite-state automata that can 
move in one direction (1-way) or both directions (2-way) on the input, and which
are deterministic (DFAs), non-deterministic (NFAs) or alternating (AFAs). In
one dimension, these are all equivalent in their computational power.

* Research supported by NSF Grant CCR 97-33101.

In two dimensions, natural generalizations of finite-state automata are 4-way 
finite-state automata, which at each step can read a symbol of the array, change 
their internal state, and move up, down, left or right to a neighboring symbol. 
These can be deterministic, non-deterministic or alternating. Automata of this 
kind were introduced by Blum and Hewitt [BH67]. 
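
To make the model concrete, here is a minimal sketch of running a deterministic 4-way automaton on a picture. It is only an illustration under simplifying assumptions that are not taken from the text: the picture is a non-empty list of equal-length strings, the automaton starts in the upper-left corner, the transition table maps (state, symbol) to (new state, move), a missing transition or a step off the picture rejects, and a repeated configuration is treated as an infinite loop and rejected. The name `run_4way_dfa` and the example tables are hypothetical.

```python
MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}


def run_4way_dfa(delta, start, accepting, picture):
    """Simulate a deterministic 4-way automaton; return True iff it accepts."""
    rows, cols = len(picture), len(picture[0])
    state, r, c = start, 0, 0
    seen = set()  # visited configurations (state, row, column)
    while True:
        if state in accepting:
            return True
        if (state, r, c) in seen:
            return False  # a deterministic run that repeats a configuration loops forever
        seen.add((state, r, c))
        key = (state, picture[r][c])
        if key not in delta:
            return False
        state, move = delta[key]
        if state in accepting:
            return True
        dr, dc = MOVES[move]
        r, c = r + dr, c + dc
        if not (0 <= r < rows and 0 <= c < cols):
            return False  # assumption: leaving the picture rejects


# Example: scan the first row to the right and accept on the first "b".
find_b = {("scan", "a"): ("scan", "R"), ("scan", "b"): ("acc", "R")}
# Example of a looping automaton: bounce between two neighbouring cells.
bounce = {("s", "a"): ("t", "R"), ("t", "a"): ("s", "L")}
```

Since the number of configurations is finite, the loop check guarantees termination of the simulation even when the automaton itself would run forever.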

Another definition of regular language that we can generalize to two dimen- 
sions is the following. A finite complement language is one defined by forbidding 
a finite number of subwords. While not every regular language is finite comple- 
ment, every regular language is the image of a finite complement language under 
an alphabetic homomorphism, i.e. a function that maps symbols from one alpha- 
bet into another (possibly smaller) one. In two dimensions, a picture language 
is called local if it can be defined by forbidding a finite number of local blocks, 
and the image of such a language under an alphabetic homomorphism is tiling 
recognizable. Without loss of generality we may assume that all forbidden blocks 
have size 2 x 2, or are 1 x 2 or 2 x 1 dominoes [GR92]. 
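
Membership in a local picture language can be tested by sliding a 2 x 2 window over the picture. The sketch below is a hedged illustration: it assumes the picture is a non-empty list of equal-length strings, that forbidden blocks are given as pairs (top row, bottom row) of two-character strings, and that the picture is surrounded by a border of a blank symbol, written `#` here as an assumption; the function name `in_local_language` is hypothetical.

```python
def in_local_language(picture, forbidden, blank="#"):
    """Return True iff no forbidden 2 x 2 block occurs in the bordered picture."""
    cols = len(picture[0])
    border = blank * (cols + 2)
    padded = [border] + [blank + row + blank for row in picture] + [border]
    for r in range(len(padded) - 1):
        for c in range(cols + 1):
            block = (padded[r][c:c + 2], padded[r + 1][c:c + 2])
            if block in forbidden:
                return False
    return True


# Forbid an "a" whose upper, left and upper-left neighbours are all blank,
# i.e. an "a" in the upper-left corner of the picture.
no_a_in_corner = {("##", "#a")}
```

With this single forbidden block, the picture `["ab", "ba"]` is rejected and `["ba", "ab"]` is accepted. A tiling recognizable language would additionally apply a letter-to-letter map to the pictures accepted this way.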

Tiling recognizable languages have also been called homomorphisms of local 
lattice languages or h(LLL)s [LMN98] or the languages recognizable by non- 
deterministic on-line tessellation acceptors [IN77]. We will follow [GR92] and 
denote this set of languages REG. 

While DFAs, NFAs, AFAs and REG are all equivalent to the regular lan- 
guages in one dimension, in two or more dimensions they become distinct: 



$\mathrm{DFA} \subset \mathrm{NFA} \subset \mathrm{AFA}$ and $\mathrm{NFA} \subset \mathrm{REG}$,

where all of these inclusions are strict. We recommend [LMN98,GR96,IT91],
[Ros79] for reviews of these classes. A bibliography of papers in the subject is 
maintained by Borchert at [BB]. 

Note that we restrict our automata to move within the picture they are 
trying to recognize. (For DFAs, it is known that allowing them to move outside 
the picture into a field of blanks does not increase their computational power 
[Ros79]. For NFAs this is known only for 1 x n pictures [LMN98], and for AFAs 
the question is open.) However, we allow them to sense the boundary of the 
rectangle, and make different transitions accordingly. Similarly, when defining a 
language in REC we allow its local pre-image to forbid blocks containing blank
symbols # outside the rectangle; for instance, forbidding the 2 x 2 block

    # #
    # a

prevents the symbol a from appearing in the upper-left corner of the picture.

A fair amount is known about the closure properties of these classes. The
DFAs, NFAs, AFAs and REC are all closed under intersection and union using
straightforward constructions. DFAs are also closed under complementation by
an argument of Sipser [Sip80] which allows us to remove the danger that a DFA
might loop forever and never halt. We construct a new DFA that starts in the
final halt state, which we can assume without loss of generality is in the lower
right-hand corner. Then this DFA does a depth-first search backwards, attempting
to reach the initial state of the original DFA, and using the original DFA’s
transitions to backtrack along the tree. This gives a loop-free DFA which accepts
if and only if the original DFA accepts, and since a loop-free DFA always halts,
we can then switch accepting and rejecting states to recognize the complement
of the original language. Furthermore, it is known that REC is not closed under
complementation [Sze92], even for languages over a unary alphabet [Mat00,
Thm. 2.26].

In contrast, up to now it has been an open question whether the 4-way NFA
and AFA language classes are closed under complementation [GR96,LMN98]. In
this paper we resolve both these questions in the negative, thus completing the
following matrix of Boolean closure properties:





           $\cap$    $\cup$    co
    DFA    yes       yes       yes
    NFA    yes       yes       no
    AFA    yes       yes       no
    REC    yes       yes       no



Furthermore, the relationship between REC and AFA has been open up to now
[IT91]. Here we show that the complement of an AFA language is tiling
recognizable and that the classes REC and AFA are incomparable, i.e.

    co-AFA $\subseteq$ REC, AFA $\not\subseteq$ REC, and REC $\not\subseteq$ AFA.

Specifically, the set of acyclic directed grid graphs with crossover is in AFA but
not REC, and its complement is in REC but not AFA.

We also explore picture languages over a unary alphabet. Such pictures are 
unmarked rectangles and they can be identified with pairs of positive integers 
indicating the width and the height of the rectangle. In the final section of the 
paper we show that NFAs are not closed under complementation, and that the 
inclusions DFA $\subset$ NFA $\subset$ AFA are strict even for unary alphabets.

2 Alternation and Tiling Recognizable Picture Languages 

Recall that an alternating finite-state automaton has existential and universal 
states. A computation that meets a universal (resp. existential) state accepts 
if every transition (resp. at least one transition) from that state leads to an 
accepting computation. Thus an NFA is an AFA with only existential states. 

In this section we prove that AFA and REC are incomparable, by first proving
that the complement of an AFA-recognizable language is in REC. To illustrate
the idea of the proof, consider a 4-way NFA A over an arbitrary alphabet. A
configuration consists of the state and position of A. By definition, A does not 
accept a picture if and only if an accepting state cannot be reached from the 
initial configuration, that is, iff every possible computation either goes on indefi- 
nitely or halts in a non-accepting state. This is clearly equivalent to the existence 
of a set C of configurations with the following properties: 
(i) the initial configuration is in C

(ii) all possible immediate successors of all configurations of C are in C 

(iii) there is no accepting configuration in C. 

Let A’s set of internal states be S. For a given cell of the input picture, call a state 
in S “reachable” if C contains configurations at that cell in that state. Recall 
that a language in REC is the image of a local language under some alphabetic 
homomorphism; this local language can be over a larger alphabet, and so it can 
include “hidden variables” in each cell, including one bit for each $s \in S$ which
indicates whether s is reachable at that cell. Since configurations only make 
transitions between neighboring cells, conditions (i)-(iii) can then be checked 
locally. If these hidden variables are then erased by the homomorphism, we have 
a tiling recognizable picture language that contains exactly those pictures that 
are not accepted by the NFA. 
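The certificate argument above can be sketched in code. The following Python sketch is illustrative only: the transition interface `delta(state, symbol)`, the omission of border sensing, and the toy automaton `toy_delta` are assumptions made for this example and do not come from the text. It computes the smallest set C closed under successors, stores it as per-cell sets of reachable states (the "hidden variables"), and then re-checks conditions (i)-(iii) by looking only at a cell and its neighbors.

```python
from collections import deque

MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1), "S": (0, 0)}

def successors(pic, delta, conf):
    """Immediate successors of a configuration (state, row, col)."""
    state, r, c = conf
    result = []
    for next_state, move in delta(state, pic[r][c]):
        dr, dc = MOVES[move]
        r2, c2 = r + dr, c + dc
        # The automaton may not leave the picture.
        if 0 <= r2 < len(pic) and 0 <= c2 < len(pic[0]):
            result.append((next_state, r2, c2))
    return result

def closure(pic, delta, start):
    """Smallest set C that contains `start` and is closed under successors."""
    C, queue = {start}, deque([start])
    while queue:
        for nxt in successors(pic, delta, queue.popleft()):
            if nxt not in C:
                C.add(nxt)
                queue.append(nxt)
    return C

def hidden_bits(pic, C):
    """Per-cell sets of states that occur in C at that cell."""
    bits = [[set() for _ in row] for row in pic]
    for state, r, c in C:
        bits[r][c].add(state)
    return bits

def locally_valid(pic, delta, bits, start, accepting):
    """Check conditions (i)-(iii) cell by cell."""
    s0, r0, c0 = start
    if s0 not in bits[r0][c0]:                       # condition (i)
        return False
    for r, row in enumerate(bits):
        for c, states in enumerate(row):
            for state in states:
                if state in accepting:               # condition (iii)
                    return False
                for s2, r2, c2 in successors(pic, delta, (state, r, c)):
                    if s2 not in bits[r2][c2]:       # condition (ii)
                        return False
    return True

def toy_delta(state, symbol):
    """Hypothetical NFA: walk right or down over 'a' cells, accept on a 'b'."""
    if state == "walk" and symbol == "a":
        return [("walk", "R"), ("walk", "D")]
    if state == "walk" and symbol == "b":
        return [("acc", "S")]
    return []
```

For a picture the toy automaton rejects, `hidden_bits(pic, closure(...))` passes `locally_valid`; for an accepted picture the closure necessarily contains an accepting configuration, so the check fails.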

This construction can be generalized to alternating finite automata: 

Theorem 1. The complement of every AFA recognizable picture language is
tiling recognizable, so co-AFA $\subseteq$ REC.

Proof. Let A be an AFA with internal states S which accepts a picture language
L over an alphabet $\Sigma$. A picture P is not accepted by A if and only if there
exists a set C of configurations such that

(i) the initial configuration is in C

(ii) if $c \in C$ is existential then all possible successors of c are in C

(iii) if $c \in C$ is universal then at least one of its immediate successors is in C

(iv) there is no accepting configuration in C.

(Notice that we have adopted the convention that a universal state with no
successors accepts.) We construct a local pre-image for P over an expanded
alphabet $\Sigma \times \{0,1\}^{|S|}$, so that each cell contains two variables: the symbol of
the input, and $|S|$ bits indicating which states are reachable at that cell in C.
It is easy to see that we can verify conditions (i)-(iv) by forbidding a finite set
of local blocks. We then use an alphabetic homomorphism from this expanded
alphabet into $\Sigma$ that erases the second variable, giving us a tiling system that
accepts exactly those pictures that A does not accept. Thus the complement of L
is in REC. □

Now we know that complements of AFA languages are tiling recognizable. 
What about AFA languages themselves? This construction fails since a tiling 
system cannot make sure the AFA’s computation path contains no loops. In the 
case of an NFA, loops can be prevented locally by demanding that each config- 
uration has a unique predecessor, and therefore the construction can be saved 
by storing this predecessor in the hidden variables as well. But in the presence 
of universal states the same configuration may appear in different branches of 
a computation tree with different predecessors. Similarly, the complement of a 
language recognized by an 7T^ AFA, i.e. one with only universal states, is not 
necessarily recognized by an NFA, since inputs might be rejected by loopy com- 
putation paths which never halt. 
In fact, in the following we demonstrate that there exist AFA languages that
are not tiling recognizable. The basic reason is the inability of tiling systems
to recognize directed acyclic graphs. To illustrate the idea, we first consider a
picture language Lp consisting of pictures that represent two identical
permutations. A permutation of n elements is represented as an n x n picture of 0s
and 1s such that every row and every column of the picture contains exactly one
symbol 1. Two identical permutations are concatenated and a single column of
2s is placed between them as a separator. This way pictures of size (2n + 1) x n
are obtained. For example,



    1 0 0 2 1 0 0
    0 0 1 2 0 0 1
    0 1 0 2 0 1 0



is a picture in the language Lp. 
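The definition of Lp can be made concrete with a direct membership test. This is a minimal Python sketch, assuming pictures are given as lists of rows of integers; the function names are hypothetical and chosen only for this illustration.

```python
def is_permutation_matrix(block):
    """True if `block` is an n x n 0/1 matrix with exactly one 1 per row and column."""
    n = len(block)
    return (all(len(row) == n for row in block)
            and all(set(row) <= {0, 1} for row in block)
            and all(sum(row) == 1 for row in block)
            and all(sum(col) == 1 for col in zip(*block)))

def in_Lp(picture):
    """True if `picture` has n rows of width 2n + 1: a permutation, a wall of 2s, the same permutation."""
    n = len(picture)
    if n == 0 or any(len(row) != 2 * n + 1 for row in picture):
        return False
    left = [row[:n] for row in picture]
    wall = [row[n] for row in picture]
    right = [row[n + 1:] for row in picture]
    return (all(x == 2 for x in wall)
            and is_permutation_matrix(left)
            and left == right)
```

Applied to the 7 x 3 example above, `in_Lp` returns `True`; swapping two columns on one side only makes it return `False`.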

Lemma 1. Lp is not tiling recognizable. 

Proof. Assume the contrary, that Lp is the image of a local language L over k
symbols under some alphabetic homomorphism. Without loss of generality we
may assume that the forbidden blocks of L have size 2 x 2. Let n be large enough
for $n! > k^n$ to hold. There are n! different permutations of size n so L contains
n! pictures of size (2n + 1) x n with different images. Since the middle column
of n cells can take at most $k^n$ different values, the pigeonhole principle
states that two of the pictures must match in the middle column. By combining
the left- and right-hand sides of the two pictures we obtain a new element of L
whose image is not in Lp: the left- and right-hand sides of the image represent
two different permutations. □
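To get a feeling for how large n must be in the counting step, the following Python sketch finds the smallest n with n! > k^n for a given alphabet size k. The function name is hypothetical; the proof itself only needs that some such n exists.

```python
from math import factorial

def smallest_n(k):
    """Smallest n >= 1 such that n! > k**n."""
    n = 1
    while factorial(n) <= k ** n:
        n += 1
    return n
```

The loop terminates because n!/k^n grows by the factor (n + 1)/k at each step, which exceeds 1 once n + 1 > k.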

We note that similar counting arguments are used in [Sze92,GR96] to show 
that REC is not closed under complementation.

Lemma 2. Lp is accepted by a 4-way AFA whose states are all universal.

Proof. An AFA, or in fact a DFA, can easily verify that the given picture consists 
of two permutations separated by a column of 2s: by scanning left to right and 
top to bottom, it verifies that each column except the middle one contains exactly 
one 1, and each row contains exactly one 1 on each side of the column of 2s. 

Using universal states we then verify that the two permutations are identical. 
To do this, we try all possibilities of moving in the array in the following fashion: 

(*) From a 1 move right to another column but stay on the same side of the 
wall of 2s. Find the 1 on that column. Then move to the other 1 that is on 
the same row, on the opposite side of the wall, and repeat. 

The picture is accepted if the automaton gets stuck, that is, if it is on the 
rightmost column and is requested to find another column to the right. 
If the two permutations are different then there is a non-accepting infinite
loop that repeats instruction (*) indefinitely. Namely, there are two rows whose
corresponding columns are in different order on the two sides of the wall, which
allows an infinite loop:



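[Figure: two rows whose 1s appear in opposite column order on the two sides of the wall, with arrows tracing the resulting loop.]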



Conversely, if the two permutations are identical then all alternatives lead to 
a halting accepting computation as two repetitions of (*) always move the au- 
tomaton at least two positions to the right. Notice that the distance between 
the two 1s on the same row is constant, so the horizontal movements across the
wall cancel each other in two rounds. The remaining instructions move the au- 
tomaton at least one column to the right on both sides of the wall. This can be 
continued only until the rightmost column is reached, whereupon the automaton 
gets stuck and accepts. □ 
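The behavior of instruction (*) can be explored with a small simulation. The Python sketch below is an abstraction rather than the automaton itself: it assumes each permutation is given as a list mapping a column to the row of its 1, it treats every 1 as a possible starting point (an assumption of this sketch), and it replaces the universal branching by a search for a cycle in the graph of positions reachable through (*). All branches halt exactly when that graph has no cycle.

```python
def inverse(p):
    """Inverse of a permutation given as a list: column -> row."""
    inv = [0] * len(p)
    for col, row in enumerate(p):
        inv[row] = col
    return inv

def all_branches_halt(sigma, tau):
    """sigma[c] / tau[c] is the row of the 1 in column c of the left / right half."""
    n = len(sigma)
    perm = (sigma, tau)
    inv = (inverse(sigma), inverse(tau))

    def successors(conf):
        side, col = conf
        for col2 in range(col + 1, n):            # move right, staying on the same side
            row = perm[side][col2]                # find the 1 in that column
            yield (1 - side, inv[1 - side][row])  # cross the wall along that row

    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def acyclic_from(conf):
        colour[conf] = GREY
        for nxt in successors(conf):
            seen = colour.get(nxt, WHITE)
            if seen == GREY:                      # back edge: an infinite loop exists
                return False
            if seen == WHITE and not acyclic_from(nxt):
                return False
        colour[conf] = BLACK
        return True

    starts = [(side, col) for side in (0, 1) for col in range(n)]
    return all(colour.get(s, WHITE) == BLACK or acyclic_from(s) for s in starts)
```

For identical permutations every application of (*) strictly increases the column index, so the search finds no cycle; for different permutations two columns in opposite order on the two sides give a loop of length two.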

This gives us the other main results of this section: 

Theorem 2. Language Lp is in AFA but not in REC. The complement of Lp
is in REC but not in AFA. Therefore, AFA and REC are incomparable, and
AFA is not closed under complementation.

Proof. The first part was proved in Lemma 1 and Lemma 2. The complement of
Lp is not recognized by any AFA, because if it were then according to Theorem 1
its complement Lp would be in REC. On the other hand, by Theorem 1 the
complement of Lp is in REC since Lp is in AFA. □

Note that Theorem 2 holds even for 77^ AFAs, since the AFA in Lemma 2 
has only universal states. 

More generally, let us consider the picture language of acyclic directed grid 
graphs, suitably encoded with a finite alphabet. Lemma 2 clearly still holds since 
an AFA with only universal states can verify acyclicity by following all possible 
paths in the graph and accepting when they arrive at a sink, while Theorem 1 
applies to its complement since a tiling system can guess a cycle and mark it 
with hidden states. 

If crossover is allowed, then Lemma 1 holds as well: the idea is to divide the
picture into two halves with n nodes along the boundary between them. The
left half (say) induces a relation $\prec$ between these boundary nodes, where we say
$a \prec b$ if b is reachable from a by a directed path in that half. If this half has no
cycles by itself, then $\prec$ is a partial order.

Now consider the set of $n(n-1)$ right halves consisting of a single directed
path from a to b; combining each of these with the left half produces a cycle if
and only if $a \prec b$. Therefore, any two partial orders which differ for some pair
a, b differ on which right halves will create a cycle, and so each partial order
yields a different equivalence class of which right halves are allowed. Since the
number of partial orders is at least the number of total orders n!, and since all
of these are easily achieved by a grid graph with crossover, any local language
will run out of states for the interface and we again get a contradiction by the
pigeonhole principle. The language Lp above simply restricts to the case where
these partial orders are the total orders associated with permutations.

Thus we have 

Theorem 3. The picture language of acyclic directed grid graphs with crossover
is in AFA but not in REC, and its complement is in REC but not in AFA.

However, in the planar case where we disallow crossover, the induced partial 
orders correspond to outerplanar directed graphs. Since the number of these 
grows only exponentially in n, this pigeonhole argument fails. We conjecture, in 
fact, that the set of planar acyclic directed grid graphs is in REC. 



3 Four-Way Finite Automata Over a Unary Alphabet

In this section, we give an example of a rectangle set S that can be recognized 
by a 4-way NFA, but not by any DFA. Moreover, we show that the complement 
of S is recognized by an AFA but not by any NFA. This proves that, unlike the 
deterministic case, the class of NFA-recognizable picture languages is not closed 
under complementation, even for pictures over a unary alphabet. 

Our main tool in proving these results is to interpret two-dimensional
automata as two-way one-dimensional automata by fixing the height of the
rectangles and letting the width vary. This approach has become standard, e.g. [Mat00].
Our variant of two-way finite automata can detect when they are reading the 
first or the last symbol of the input and can make different transitions accord- 
ingly. They may move left or right, or remain stationary. They cannot move 
beyond the ends of the input. A word is accepted iff a final state can be reached. 

For a unary alphabet, pictures are just unmarked rectangles, which we can 
identify with their width and height (w, h). Then we have the following: 

Lemma 3. Let $S \subseteq \mathbb{N}^2$ be the set of rectangles recognized by a k-state 4-way
NFA A. Then for every height h there exists a two-way NFA B with kh states
recognizing the language

    $\{ 1^w \mid (w,h) \in S \}$

of corresponding widths. Moreover, if A is deterministic then B is deterministic.

Proof. The states of B are pairs (i, s) where i is an integer, $1 \le i \le h$,
representing the current vertical position of A in the rectangle, and s is the current
state of A. The position of B represents the horizontal position of A inside the 
rectangle. It is easy to interpret A’s transition rules as B’s transition rules, so 
that B will simulate A step-by-step. If A is deterministic then B is also. □ 
The following well-known “pumping lemma” allows us to prove that certain
languages cannot be accepted by any two-way NFA or DFA with a given number 
of states: 

Lemma 4. Let A be a two-way NFA with k states over the single-symbol
alphabet {1}, accepting a language $L \subseteq 1^*$. Then, for every $n > k + 2$,

    $1^n \in L \implies 1^{n+k!} \in L.$

Moreover, if A is deterministic then this implication holds in both directions.

Proof. Let $n > k + 2$ and consider an accepting computation C of input $1^n$. Let
us divide C into segments between consecutive visits of the NFA at the endpoints
of the input. The head has to move through all intermediate positions during any
segment S of C from one end to the other. There are $n - 2 \ge k + 1$ intermediate
positions. Let $s_1, s_2, \ldots, s_{n-2}$ be the states of A when it enters the intermediate
positions 1, 2, ..., n - 2, respectively, for the first time within segment S. It
follows from the pigeonhole principle that two of the states $s_1, s_2, \ldots, s_{k+1}$ must
be the same, say $s_i = s_{i+t}$. The computation between positions i and i + t can
then be repeated arbitrarily many times, taking the NFA into position i + jt for
any j > 0, and remaining in state $s_i$.

Because t divides k! this means that input $1^{n+k!}$ is accepted by a computation
that is identical to C except that in any segment of C from one end to the other
a loop of length t is repeated k!/t times.

If A is deterministic then each input has a unique computation C. If $n > k + 2$
and $1^n$ is not accepted (either A halts in a non-accepting state or loops forever)
then the same construction as above yields a non-accepting computation (halting
or looping) for input $1^{n+k!}$. In other words, in the deterministic case we have

    $1^n \in L \iff 1^{n+k!} \in L.$ □

This gives us the immediate corollary: 

Corollary 1. If L is a finite language recognized by a two-way NFA with k 
states, then its longest word has length at most k + 2.

Now we are ready to prove the main results of this section. Consider the 
following set of rectangles: 

    $S = \{ (w,h) \mid w = ih + j(h+1) \text{ for some non-negative integers } i \text{ and } j \}$

      $= \{ (w,h) \mid w = ih + j \text{ for some } 0 \le j \le i \}.$

For any given height h the set of allowed widths w is the union of contiguous
segments

    $ih, ih + 1, \ldots, ih + i$

for all i = 0, 1, .... It is easy to see that the largest width w that is not allowed
is $h^2 - h - 1$. (It is the only integer between the segments for i = h - 2 and
i = h - 1. For larger values of i the consecutive segments overlap.)
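The two descriptions of S and the claim about the largest missing width are easy to check numerically. The following Python sketch uses hypothetical function names and is only a sanity check for small heights (h at least 2), not part of the argument.

```python
def in_S_sum(w, h):
    """w = i*h + j*(h+1) for some non-negative integers i and j."""
    return any((w - j * (h + 1)) % h == 0 for j in range(w // (h + 1) + 1))

def in_S_segments(w, h):
    """w = i*h + j for some 0 <= j <= i; it suffices to try i = w // h."""
    i, j = divmod(w, h)
    return j <= i

def largest_missing_width(h):
    """Largest w below h*h that is not an allowed width (requires h >= 2)."""
    return max(w for w in range(h * h) if not in_S_segments(w, h))
```

Trying only i = w // h in `in_S_segments` is enough because any valid i satisfies w/(h+1) <= i <= w/h, and the largest integer not exceeding w/h then lies in the same interval.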
The set S can be easily recognized by a billiard-ball-like NFA that sets out
at a 45° angle from the lower left corner of the rectangle. When the ball hits 
the upper or lower edge of the rectangle it either bounces back immediately, 
or moves one cell to the right along the edge before bouncing, as in Figure 1. 
The rectangle is accepted if the ball is in either corner of the rectangle when it 
reaches the right edge. 




However, the complement $\overline{S}$ is not accepted by any NFA. Notice that $\overline{S}$
contains a finite number of rectangles of any given height h, the largest of which
has width $w = h^2 - h - 1$. Assume that $\overline{S}$ is accepted by a 4-way NFA with
k states. According to Lemma 3 there exists a two-way NFA B with kh states
that accepts the finite language

    $L = \{ 1^w \mid (w,h) \notin S \}.$

If we choose the height h such that $h^2 - h - 1 > kh + 2$, then the longest word in
L has length greater than the number of states in B plus two. This contradicts
the corollary to Lemma 4, so we have proved

Theorem 4. The NFA-recognizable picture languages are not closed under
complementation, even for a unary alphabet.
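The argument preceding Theorem 4 needs a height h with $h^2 - h - 1 > kh + 2$. As a quick check that one always exists, here is a short worked computation for the convenient (not necessarily smallest) choice h = k + 3; this particular choice is an illustration added here and is not made in the text.

```latex
\begin{align*}
h^2 - h - 1 - (kh + 2) &= h\,(h - 1 - k) - 3 \\
                       &= 2(k + 3) - 3 \qquad \text{for } h = k + 3 \\
                       &= 2k + 3 > 0 .
\end{align*}
```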

Since the DFA-recognizable languages are closed under complementation 
[Sip80] as discussed above, it follows that S is not in DFA. We can also show
this directly using the deterministic variants of Lemma 3 and Lemma 4. 

Furthermore, it is easy to see that if a language is recognized by a loop-free 
NFA, then its complement is accepted by an AFA whose states are all universal. 
Here, for instance, the AFA moves diagonally at a 45° angle from the lower left 
corner. When it hits the upper or lower edge of the rectangle a universal state 
splits the computation into two parts: one in which the AFA bounces from the 
edge immediately and one in which the bounce is delayed by one step to the 
right. Both alternatives must lead to accepting computations. A computation is 
accepting if the AFA is not in the corner when it reaches the right edge of the 
rectangle. Since S is in NFA but not DFA, and $\overline{S}$ is in AFA but not in NFA, we
have proved

Theorem 5. The inclusions DFA $\subset$ NFA $\subset$ AFA are strict even for picture
languages over a unary alphabet.

4 Conclusions 

We have solved several open problems in the field of two-dimensional finite-state
automata and picture languages. In particular, we have shown that the NFA-
and AFA-recognizable picture languages are not closed under complementation,
that the complement of an AFA-recognizable language is in the set REC of tiling
recognizable languages, and that AFA and REC are incomparable. In fact, since
the AFA in Lemma 2 has only universal states, even this restriction of AFA is
incomparable with REC. All these results generalize easily to more dimensions,
so they hold for d-dimensional picture languages whenever $d \ge 2$.

Some authors have studied 3-way automata, which are only allowed to move
(say) up, down, and right. The NFA for S in Theorem 4 is in fact 3-way, so we
can conclude that there exists a 3-way NFA language that is not recognized by
any 4-way DFA, and whose complement is not accepted by any 4-way NFA. In
addition, the AFA for $\overline{S}$ in Theorem 5 is 3-way so there also exists a 3-way AFA
language that is not accepted by any 4-way NFA.

Finally, we leave as an open question our conjecture that the picture
language of planar acyclic directed graphs is tiling recognizable. Other interesting
questions include whether AFA is not closed under complementation for a unary
alphabet, and whether REC and co-AFA are distinct.
Abstract. A dichotomy theorem for a class of decision problems is a result
asserting that certain problems in the class are solvable in polynomial time, while
the rest are NP-complete. The first remarkable such dichotomy theorem was
proved by T.J. Schaefer in 1978. It concerns the class of generalized satisfiability
problems SAT(S), whose input is a CNF(S)-formula, i.e., a formula constructed
from elements of a fixed set S of generalized connectives using conjunctions and
substitutions by variables. Here, we investigate the complexity of minimal
satisfiability problems MIN SAT(S), where S is a fixed set of generalized connectives.
The input to such a problem is a CNF(S)-formula and a satisfying truth
assignment; the question is to decide whether there is another satisfying truth
assignment that is strictly smaller than the given truth assignment with respect to the
coordinate-wise partial order on truth assignments. Minimal satisfiability
problems were first studied by researchers in artificial intelligence while investigating
the computational complexity of propositional circumscription. The question of
whether dichotomy theorems can be proved for these problems was raised at that
time, but was left open. In this paper, we settle this question affirmatively by
establishing a dichotomy theorem for the class of all MIN SAT(S)-problems.



1 Introduction and Summary of Results 

Computational complexity strives to analyze important algorithmic problems by first 
placing them in suitable complexity classes and then attempting to determine whether 
they are complete for the class under consideration or they actually belong to a more 
restricted complexity class. This approach to analyzing algorithmic problems has borne 
fruit in numerous concrete cases and has led to the successful development of the theory 
of NP-completeness. In this vein, dichotomy theorems for classes of NP-problems are 
of particular interest, where a dichotomy theorem is a result that concerns an infinite 
class T of related decision problems and asserts that certain problems in T are solvable 
in polynomial time, while on the contrary all other problems in T are NP-complete. It 
should be pointed out that the a priori existence of dichotomy theorems cannot be
taken for granted. Indeed, Ladner [17] showed that if P $\ne$ NP, then there are problems
in NP that are neither NP-complete nor in P. Consequently, a given class T of
NP-problems may contain such problems of intermediate complexity, which rules out the
existence of a dichotomy theorem for T . 

The first remarkable (and highly non-trivial) dichotomy theorem was established by
Schaefer [22], who introduced and studied the class of GENERALIZED SATISFIABILITY
problems (see also [8, LO6, page 260]). A logical relation (or generalized connective)
R is a non-empty subset of $\{0,1\}^k$, for some $k \ge 1$. If $S = \{R_1, \ldots, R_m\}$ is a finite
set of logical relations, then a CNF(S)-formula is a conjunction of expressions (called
generalized clauses or, simply, clauses) of the form $R'_i(x_1, \ldots, x_k)$, where each $R'_i$ is a
relation symbol representing the logical relation $R_i$ in S and each $x_j$ is a Boolean
variable. Each finite set S of logical relations gives rise to the GENERALIZED
SATISFIABILITY problem SAT(S): given a CNF(S)-formula $\varphi$, is $\varphi$ satisfiable? Schaefer isolated
six efficiently checkable conditions and proved the following dichotomy theorem for
the class of all GENERALIZED SATISFIABILITY problems SAT(S): if the set S satisfies
at least one of these six conditions, then SAT(S) is solvable in polynomial time;
otherwise, SAT(S) is NP-complete. Since that time, only a handful of dichotomy theorems
for other classes of decision problems have been established. Two notable ones are the
dichotomy theorem for the class of FIXED SUBGRAPH HOMEOMORPHISM problems
on directed graphs, obtained by Fortune, Hopcroft and Wyllie [6], and the dichotomy
theorem for the class of H-COLORING problems on undirected graphs, obtained by Hell
and Nešetřil [9]. The latter is a special case of CONSTRAINT SATISFACTION, a rich
class of problems that have been the object of systematic study in artificial intelligence.
It should be noted that no dichotomy theorem for the entire class of CONSTRAINT
SATISFACTION problems has been established thus far, in spite of intensive efforts to this
effect (see Feder and Vardi [7], Jeavons, Cooper and Gyssens [11]).

In recent years, researchers have obtained dichotomy theorems for optimization
problems, counting problems, and decision problems that are variants of GENERALIZED
SATISFIABILITY problems. Creignou [4], Khanna, Sudan and Williamson [14],
Khanna, Sudan and Trevisan [13], and Zwick [23] obtained dichotomy theorems for
certain classes of optimization problems related to propositional satisfiability and Boolean
constraint satisfaction, Creignou and Hermann [3] proved a dichotomy theorem for the
class of counting problems that ask for the number of satisfying assignments of a given
CNF(S)-formula, and Kavvadias and Sideri [12] established a dichotomy theorem for
the class of decision problems INVERSE SAT(S) that ask whether a given set of truth
assignments is the set of all satisfying assignments of some CNF(S)-formula, where S
is a finite set of logical relations. Even more recently, Reith and Vollmer [21] proved a
dichotomy theorem for the class of optimization problems LEXMIN SAT(S) and
LEXMAX SAT(S) that ask for the lexicographically minimal (or maximal) truth assignment
that satisfies a given CNF(S)-formula. In addition, Istrate [10] investigated the
existence of a dichotomy for the restriction of generalized satisfiability problems in which
each variable appears at most twice.

Researchers have also investigated the class of decision problems MIN SAT(S),
where S is a finite set of logical relations. For a fixed S, the input to the problem is a
CNF(S)-formula $\varphi$ and a satisfying truth assignment $\alpha$ of $\varphi$; the question is to decide
whether there is another satisfying truth assignment $\beta$ of $\varphi$ such that $\beta < \alpha$, where < is
the coordinate-wise partial order on truth assignments. These decision problems were
introduced and studied by researchers in artificial intelligence while investigating
circumscription, a well-developed formalism of common-sense reasoning introduced by
McCarthy [19] about twenty years ago. The main question left open about MIN SAT(S)
was whether a dichotomy theorem holds for the class of all MIN SAT(S) problems,
where S is a finite set of logical relations. In the present paper, we settle this question in
the affirmative and also provide easily checkable criteria that tell apart the
polynomial-time solvable cases of MIN SAT(S) from the NP-complete ones.
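To make the MIN SAT(S) problem statement concrete, here is a brute-force Python sketch. The representation is an assumption of this example: a relation is a set of 0/1 tuples, a formula is a list of (relation, variables) clauses, and an assignment is a dictionary from variable names to 0 or 1. The search takes exponential time and only illustrates the definition; it says nothing about the tractable cases.

```python
from itertools import product

def satisfies(formula, alpha):
    """True if every clause (relation, variables) holds under the assignment alpha."""
    return all(tuple(alpha[x] for x in xs) in rel for rel, xs in formula)

def strictly_smaller_model(formula, alpha):
    """Return a satisfying beta < alpha in the coordinate-wise order, or None."""
    ones = [x for x in sorted(alpha) if alpha[x] == 1]
    # beta <= alpha forces beta to keep every 0 of alpha, so only the 1s may change.
    for bits in product((0, 1), repeat=len(ones)):
        if all(bits):
            continue  # this is alpha itself, which is not strictly smaller
        beta = dict(alpha)
        beta.update(zip(ones, bits))
        if satisfies(formula, beta):
            return beta
    return None
```

For the binary relation OR = {(0,1), (1,0), (1,1)} and the single clause OR(x, y), the assignment x = y = 1 is not minimal, while x = 1, y = 0 is.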

In circumscription, properties are specified using formulas of some logic, a natural 
partial order between models of each formula is considered, and preference is given 
to models that are minimal with respect to this partial order. McCarthy’s key intuition 
was that minimal models should be preferred because they are the ones that have as 
few “exceptions” as possible and thus embody common-sense. A fundamental algorith- 
mic problem about every logical formalism is model checking, the problem of deciding 
whether a finite structure satisfies a formula. As regards circumscription, model check- 
ing amounts to the problem of deciding whether a finite structure is a minimal model 
of a formula. The simplest case of circumscription is propositional circumscription, 
where properties are specified using formulas of propositional logic; thus, the model 
checking problem for propositional circumscription is precisely the problem of decid- 
ing whether a satisfying truth assignment of a propositional formula is minimal with 
respect to the coordinate-wise order. Clearly, this problem is equivalent to the comple- 
ment of the minimal satisfiability problem; moreover, it is not hard to show that this 
problem is coNP-complete, when arbitrary propositional formulas are allowed as part 
of the input. For this reason, researchers in artificial intelligence embarked on the pur- 
suit of tractable cases of the model checking problem for propositional circumscription. 
In particular, Cadoli [1,2] adopted Schaefer’s approach, introduced the class of
decision problems MIN SAT(S), identified several tractable cases, and raised the question
of the existence of a dichotomy theorem for this class (see [2, page 132]). Moreover,
Cadoli pointed out that if a dichotomy theorem for MIN SAT(S) indeed exists, then
the dividing line is going to be very different from the dividing line in Schaefer’s
dichotomy theorem for SAT(S). To see this, consider first the set $S = \{R_{1/3}\}$, where
$R_{1/3} = \{(1,0,0), (0,1,0), (0,0,1)\}$. In this case, SAT(S) is the well-known
NP-complete problem POSITIVE-1-IN-3-SAT, while on the contrary MIN SAT(S) is trivial,
since it can be easily verified that every satisfying truth assignment of a given
CNF(S)-formula is minimal. Thus, an intractable case of SAT(S) becomes a tractable (in fact, a
trivial) case of MIN SAT(S). In the opposite direction, Cadoli [1,2] showed that certain
tractable (in fact, trivial) cases of SAT(S) become NP-complete cases of MIN SAT(S).
Specifically, one of the six tractable cases in Schaefer’s dichotomy theorem is the case
where S consists entirely of 1-valid logical relations, that is, every relation R in S
contains the all-ones tuple (1, ..., 1) (and, hence, every CNF(S)-formula is satisfied by the
truth assignment that assigns 1 to every variable). In contrast, Cadoli [1,2] discovered a
finite set S of 1-valid relations such that MIN SAT(S) is NP-complete.
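The remark that every satisfying assignment of a formula over $R_{1/3}$ is minimal can be checked by brute force on a small instance. The Python sketch below uses a hypothetical three-clause formula over five variables, each of which occurs in some clause; the formula and the function names are assumptions made for this illustration.

```python
from itertools import product

R13 = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}

def models(formula, n):
    """All satisfying assignments (0/1 tuples) of a formula over variables 0..n-1."""
    return [a for a in product((0, 1), repeat=n)
            if all(tuple(a[x] for x in xs) in R13 for xs in formula)]

def strictly_below(b, a):
    """Coordinate-wise partial order: b <= a everywhere and b != a."""
    return b != a and all(x <= y for x, y in zip(b, a))

# Each clause lists the three variables to which R13 is applied.
formula = [(0, 1, 2), (2, 3, 4), (0, 3, 4)]
sat = models(formula, 5)
no_model_below_another = not any(strictly_below(b, a) for a in sat for b in sat)
```

The reason is visible in the relation itself: a clause requires exactly one of its three variables to be 1, so changing any 1 to 0 leaves some clause with no 1 at all.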

As it turns out, the collection of 1-valid relations holds the key to the dichotomy
theorem for MIN SAT(S). More precisely, we first establish a dichotomy theorem for
the class of MIN SAT(S) problems, where S is a finite set of 1-valid relations. Using
this restricted dichotomy theorem as a stepping stone, we then derive the desired
dichotomy theorem for the full class of MIN SAT(S) problems, where S is a finite set
of arbitrary logical relations. Note that all dichotomy theorems described thus far
involve CNF(S)-formulas that do not contain the constant symbols 0 and 1; Schaefer
[22], however, also proved a dichotomy theorem for CNF(S)-formulas with constant
symbols. Here, we derive dichotomy theorems for minimal satisfiability of CNF(S)
formulas with constant symbols as well. Our results differ from earlier dichotomy
theorems for satisfiability problems in two major aspects. First, in all earlier dichotomy
theorems the tractable cases arise from conditions that are directly applied to the set
S of logical relations under consideration; in our main dichotomy theorem, however,
the tractable cases arise from conditions that are applied not to the set S of logical
relations at hand, but to a certain set S* of 1-valid logical relations obtained from S by
projecting the relations in S in a particular way. Second, the proofs of essentially all
earlier dichotomy theorems for satisfiability problems used Schaefer’s dichotomy
theorem; furthermore, they often hinged on stronger versions of what has become known
as Schaefer’s expressibility theorem [22, Theorem 3.0, page 219], which asserts that
if S does not satisfy at least one of the six conditions that give rise to tractable cases
of SAT(S), then every logical relation is definable from some CNF(S)-formula using
existential quantification and substitution by constants. The proof of our dichotomy
theorem for MIN SAT(S), however, hinges on new and rather delicate expressibility
results that provide precise information about the way particular logical relations, such
as the implication connective, are definable from CNF(S)-formulas using existential
quantification and substitution by constants.

Researchers in artificial intelligence have also investigated various powerful exten- 
sions of circumscription in which the partial order among models of a formula is mod- 
ified, so that some parts of the model are assigned fixed values and some other parts 
are allowed to vary arbitrarily [18,20]. We have been able to establish dichotomy the- 
orems for the model checking problem for most such extensions of propositional cir- 
cumscription, thus answering another question left open by Cadoli [1,2]. These results 
are contained in the full version of the present paper, available in electronic form [15]. 

2 Preliminaries and Background 

This section contains the definitions of the main concepts used in this paper and a 
minimum amount of the necessary background material from Schaefer’s work on the 
complexity of Generalized Satisfiability problems [22]. 

Definition 1. Let $S = \{R_1, \ldots, R_m\}$ be a finite set of logical relations of various
arities, let $S' = \{R'_1, \ldots, R'_m\}$ be a set of relation symbols whose arities match those
of the relations in S, and let V be an infinite set of variables.

A CNF(S)-formula is a finite conjunction $C_1 \wedge \ldots \wedge C_n$ of clauses built using
relation symbols from S' and variables from V, that is, each $C_i$ is an atomic formula of the
form $R'_j(x_1, \ldots, x_k)$, where $R'_j$ is a relation symbol of arity k in S', and $x_1, \ldots, x_k$ are
variables in V. A CNF_C(S)-formula is a formula obtained from a CNF(S)-formula
by substituting some of its variables by the constant symbols 0 or 1. The semantics of
CNF(S)-formulas and CNF_C(S)-formulas are defined in a standard way by assuming
that variables range over the set of bits {0, 1}, each relation symbol $R'_j$ in S' is
interpreted by the corresponding relation $R_j$ in S, and the constant symbols 0 and 1 are
interpreted by 0 and 1 respectively.
SAT(S) is the following decision problem: given a CNF(S)-formula $\varphi$, is it
satisfiable? (i.e., is there a truth assignment to the variables of $\varphi$ that makes every clause of $\varphi$
true?) The decision problem SAT_C(S) is defined in a similar way.

It is clear that, for each finite set S of logical relations, both SAT(S) and SAT_C(S)
are problems in NP. Moreover, several well-known NP-complete problems and several
important tractable cases of Boolean satisfiability can easily be cast as SAT(S)
problems for particular sets S of logical relations. Indeed, we already saw in the
previous section that the NP-complete problem POSITIVE-1-IN-3-SAT ([8, LO4, page 259])
is precisely the problem SAT(S), where S is the singleton consisting of the relation
$R_{1/3} = \{(1,0,0), (0,1,0), (0,0,1)\}$. Moreover, the prototypical NP-complete problem
3-SAT coincides with the problem SAT(S), where $S = \{R_0, R_1, R_2, R_3\}$ and
$R_0 = \{0,1\}^3 \setminus \{(0,0,0)\}$ (expressing the clause $(x \vee y \vee z)$),
$R_1 = \{0,1\}^3 \setminus \{(1,0,0)\}$ (expressing the clause $(\neg x \vee y \vee z)$),
$R_2 = \{0,1\}^3 \setminus \{(1,1,0)\}$ (expressing the clause $(\neg x \vee \neg y \vee z)$), and
$R_3 = \{0,1\}^3 \setminus \{(1,1,1)\}$ (expressing the clause $(\neg x \vee \neg y \vee \neg z)$).

Similarly, but on the side of tractability, 2-SAT is precisely the problem SAT(S), where
$S = \{R_0, R_1, R_2\}$ and $R_0 = \{0,1\}^2 \setminus \{(0,0)\}$ (expressing the clause $(x \vee y)$),
$R_1 = \{0,1\}^2 \setminus \{(1,0)\}$ (expressing the clause $(\neg x \vee y)$), and
$R_2 = \{0,1\}^2 \setminus \{(1,1)\}$ (expressing the clause $(\neg x \vee \neg y)$).
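
To make these encodings concrete, the following Python sketch represents a logical relation as a set of bit-tuples and a CNF(S)-formula as a list of clauses, each clause being a pair (relation, tuple of variable indices). This representation and the names `complement`, `satisfies`, and `brute_force_sat` are illustrative assumptions of the sketch, not notation from the paper; the search is exponential and only meant for checking small instances by hand.

```python
from itertools import product

def complement(arity, excluded):
    """Return {0,1}^arity minus the excluded tuples."""
    return set(product((0, 1), repeat=arity)) - set(excluded)

# Ternary relations for the four clause shapes of 3-SAT.
R0 = complement(3, [(0, 0, 0)])  # x or y or z
R1 = complement(3, [(1, 0, 0)])  # not x or y or z
R2 = complement(3, [(1, 1, 0)])  # not x or not y or z
R3 = complement(3, [(1, 1, 1)])  # not x or not y or not z

# The relation behind POSITIVE-1-IN-3-SAT.
R13 = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}

def satisfies(formula, assignment):
    """True if every clause (relation, variable indices) holds under the assignment."""
    return all(tuple(assignment[v] for v in variables) in relation
               for relation, variables in formula)

def brute_force_sat(formula, num_vars):
    """Return some satisfying assignment as a tuple, or None if there is none."""
    for assignment in product((0, 1), repeat=num_vars):
        if satisfies(formula, assignment):
            return assignment
    return None
```

For instance, `brute_force_sat([(R13, (0, 1, 2)), (R13, (1, 2, 3))], 4)` searches for an assignment in which exactly one variable of each clause is true.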

The next two definitions introduce the key concepts needed to formulate Schaefer’s 
dichotomy theorems. 

Definition 2. Let $\varphi$ be a propositional formula.

$\varphi$ is 1-valid if it is satisfied by the truth assignment that assigns 1 to every variable.
Similarly, $\varphi$ is 0-valid if it is satisfied by the truth assignment that assigns 0 to every
variable.

$\varphi$ is bijunctive if it is a 2CNF-formula, i.e., it is a conjunction of clauses each of
which is a disjunction of at most two literals (variables or negated variables).

$\varphi$ is Horn if it is the conjunction of clauses each of which is a disjunction of literals
such that at most one of them is a variable. Similarly, $\varphi$ is dual Horn if it is the
conjunction of clauses each of which is a disjunction of literals such that at most one of them
is a negated variable.

$\varphi$ is affine if it is the conjunction of subformulas each of which is an exclusive
disjunction of literals or a negation of an exclusive disjunction of literals (by definition,
an exclusive disjunction of literals is satisfied exactly when an odd number of these
literals are true; we will use $\oplus$ as the symbol of the exclusive disjunction). Note that
a formula $\varphi$ is affine precisely when the set of its satisfying assignments is the set of
solutions of a system of linear equations over the field {0, 1}.



Definition 3. Let R be a logical relation and S a finite set of logical relations.

R is 1-valid if it contains the tuple $(1, 1, \ldots, 1)$, whereas R is 0-valid if it contains
the tuple $(0, 0, \ldots, 0)$. We say that S is 1-valid (0-valid) if every member of S is 1-valid
(0-valid).

R is bijunctive (Horn, dual Horn, or affine, respectively) if there is a propositional
formula $\varphi$ which is bijunctive (Horn, dual Horn, or affine, respectively) and such that
R coincides with the set of truth assignments satisfying $\varphi$.
S is Schaefer if at least one of the following four conditions holds: every member
of S is bijunctive; every member of S is Horn; every member of S is dual Horn; every
member of S is affine. Otherwise, we say that S is non-Schaefer.

There are simple criteria to determine whether a logical relation is bijunctive, Horn,
dual Horn, or affine. In fact, a set of such criteria was already provided by Schaefer
[22]; moreover, Dechter and Pearl [5] gave even simpler criteria for a relation to be
Horn or dual Horn. Each of these criteria involves a closure property of the logical
relations at hand under a certain function. Specifically, a relation R is bijunctive if and
only if for all $t_1, t_2, t_3 \in R$, we have that $(t_1 \vee t_2) \wedge (t_2 \vee t_3) \wedge (t_1 \vee t_3) \in R$,
where the operators $\vee$ and $\wedge$ are applied coordinate-wise to the bit-tuples. Note that
the i-th coordinate of the tuple $(t_1 \vee t_2) \wedge (t_2 \vee t_3) \wedge (t_1 \vee t_3)$ is equal to 1 exactly
when the majority of the i-th coordinates of $t_1, t_2, t_3$ is equal to 1. Thus, this criterion
states that R is bijunctive exactly when it is closed under coordinate-wise applications
of the ternary majority function. R is Horn (respectively, dual Horn) if and only if for
all $t_1, t_2 \in R$, we have that $t_1 \wedge t_2 \in R$ (respectively, $t_1 \vee t_2 \in R$). Finally, R is affine if
and only if for all $t_1, t_2, t_3 \in R$, we have that $t_1 \oplus t_2 \oplus t_3 \in R$. As an example, it is easy
to apply these criteria to the ternary relation $R_{1/3} = \{(1,0,0), (0,1,0), (0,0,1)\}$ and
verify that $R_{1/3}$ is neither bijunctive, nor Horn, nor dual Horn, nor affine; moreover, it
is obvious that $R_{1/3}$ is neither 1-valid nor 0-valid. Finally, there are polynomial-time
algorithms that, given a logical relation that is bijunctive (Horn, dual Horn, or affine,
respectively), produce a defining propositional formula which is bijunctive (Horn, dual
Horn, or affine, respectively). See [5,16].
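
The four closure criteria translate directly into code. The sketch below is a minimal illustration, assuming a relation is given as a Python set of equal-length bit-tuples; the helper names (`closed_under`, `is_bijunctive`, and so on) are chosen for this sketch only. It checks closure by enumerating all pairs or triples of tuples, which is polynomial in the size of the relation.

```python
from itertools import product

def closed_under(R, op, k):
    """True if applying op coordinate-wise to any k tuples of R yields a tuple of R."""
    return all(tuple(op(*bits) for bits in zip(*tuples)) in R
               for tuples in product(R, repeat=k))

def is_bijunctive(R):
    # Closure under the ternary majority function.
    return closed_under(R, lambda a, b, c: (a & b) | (b & c) | (a & c), 3)

def is_horn(R):
    # Closure under coordinate-wise conjunction.
    return closed_under(R, lambda a, b: a & b, 2)

def is_dual_horn(R):
    # Closure under coordinate-wise disjunction.
    return closed_under(R, lambda a, b: a | b, 2)

def is_affine(R):
    # Closure under the coordinate-wise exclusive or of three tuples.
    return closed_under(R, lambda a, b, c: a ^ b ^ c, 3)

def is_schaefer(S):
    """S (an iterable of relations) is Schaefer if one property holds for all members."""
    S = list(S)
    return any(all(check(R) for R in S)
               for check in (is_bijunctive, is_horn, is_dual_horn, is_affine))
```

Applied to $R_{1/3}$, all four checks fail, in agreement with the remark above.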

If S is a 0-valid or a 1-valid set of logical relations, then SAT(S) is a trivial decision
problem (the answer is always “yes”). If S is an affine set of logical relations, then
SAT(S) can easily be solved in polynomial time using Gaussian elimination. Moreover,
there are well-known polynomial-time algorithms for the satisfiability problem for the
class of all bijunctive formulas (2-SAT), the class of all Horn formulas, and the class of
all dual Horn formulas. Schaefer’s seminal discovery was that the above six cases are
the only ones that give rise to tractable cases of SAT(S); furthermore, the last four are
the only ones that give rise to tractable cases of SAT_C(S).

Theorem 4. [Schaefer’s Dichotomy Theorems, [22]] Let S be a finite set of logical
relations.

If S is 0-valid or 1-valid or Schaefer, then SAT(S) is solvable in polynomial time;
otherwise, it is NP-complete.

If S is Schaefer, then SAT_C(S) is solvable in polynomial time; otherwise, it is
NP-complete.

As an application, Theorem 4 immediately implies that POSITIVE-1-IN-3-SAT is
NP-complete, since this is the same problem as SAT($\{R_{1/3}\}$), and $R_{1/3}$ is neither 0-valid, nor
1-valid, nor Schaefer.

To obtain the above dichotomy theorems, Schaefer had to first establish a result
concerning the expressive power of CNF_C(S)-formulas. Informally, this result asserts that
if S is a non-Schaefer set of logical relations, then CNF_C(S)-formulas have extremely
high expressive power, in the sense that every logical relation can be defined from a
CNF_C(S)-formula using existential quantification.
Theorem 5. [Schaefer’s Expressibility Theorem, [22]] Let S be a finite set of logical
relations. If S is non-Schaefer, then for every k-ary logical relation R there is a
CNF_C(S)-formula $\varphi(x_1, \ldots, x_k, z_1, \ldots, z_m)$ such that R coincides with the set of all
truth assignments to the variables $x_1, \ldots, x_k$ that satisfy the formula $(\exists z)\varphi(x, z)$.



3 Dichotomy Theorems for Minimal Satisfiability 

In this section, we present our main dichotomy theorem for the class of all minimal
satisfiability problems MIN SAT(S). We begin with the precise definition of MIN SAT(S),
as well as of certain variants of it that will play an important role in the sequel.

Definition 6. Let $\le$ denote the standard total order on {0, 1}, which means that $0 \le 1$.

Let k be a positive integer and let $\alpha = (a_1, \ldots, a_k)$, $\beta = (b_1, \ldots, b_k)$ be two
k-tuples in $\{0,1\}^k$. We write $\beta \le \alpha$ to denote that $b_i \le a_i$ for every $i \le k$. Also, $\beta < \alpha$
denotes that $\beta \le \alpha$ and $\beta \ne \alpha$.

Let S be a finite set of logical relations. MIN SAT(S) is the following decision
problem: given a CNF(S)-formula $\varphi$ and a satisfying truth assignment $\alpha$ of $\varphi$, is there
a satisfying truth assignment $\beta$ of $\varphi$ such that $\beta < \alpha$? In other words, MIN SAT(S)
is the problem to decide whether or not a given truth assignment of a given CNF(S)-
formula is minimal. The decision problem MIN SAT_C(S) is defined in a similar way by
allowing CNF_C(S)-formulas as part of the input.

Let S be a 1-valid set of logical relations. 1-MIN SAT(S) is the following decision
problem: given a CNF(S)-formula $\varphi$ (note that $\varphi$ is necessarily 1-valid), is there a
satisfying truth assignment of $\varphi$ that is different from (and, hence, smaller than) the all-ones
truth assignment $(1, \ldots, 1)$?

A CNF_1(S)-formula is obtained from a CNF(S)-formula by replacing some of its
variables by the constant symbol 1. The decision problem 1-MIN SAT_1(S) is defined
the same way as 1-MIN SAT(S), except that CNF_1(S)-formulas are allowed as part of
the input (arbitrary CNF_C(S)-formulas are not allowed, since substituting variables by
0 may destroy 1-validity).

As mentioned earlier, Cadoli [1,2] raised the question of whether a dichotomy theorem
for the class of all MIN SAT(S) problems exists. Note that if S is a 0-valid set of logical
relations, then MIN SAT(S) is a trivial decision problem. Moreover, Cadoli showed that
if S is a Schaefer set, then MIN SAT(S) is solvable in polynomial time. To see this, let
$\varphi$ be a CNF(S)-formula and $\alpha$ be a k-tuple in $\{0,1\}^k$ that satisfies $\varphi$. Assume, without
loss of generality, that for some l, $1 \le l \le k+1$, the components $a_j$ for $1 \le j < l$ are all
equal to 0 and the rest are all equal to 1. For each i such that $l \le i \le k$, let $\varphi_i$ be the
formula in CNF_C(S) obtained from $\varphi$ by substituting the variables $x_1, \ldots, x_{l-1}$ and
the variable $x_i$ with 0. It is easy to see that $\varphi$ has a satisfying truth assignment strictly
less than $\alpha$ if and only if at least one of the formulas $\varphi_i$ for $l \le i \le k$ is satisfiable.
Therefore MIN SAT(S) is polynomially reducible to SAT_C(S); thus, if S is Schaefer,
MIN SAT(S) is polynomially solvable. Actually, this argument also shows that if S is
Schaefer, then MIN SAT_C(S) is solvable in polynomial time. On the intractability side,
however, Cadoli [1,2] showed that there is a 1-valid set of logical relations such that
MIN SAT(S) is NP-complete. Consequently, any dichotomy theorem for MIN SAT(S)
will be substantially different from Schaefer’s dichotomy theorem for SAT(S).
Furthermore, such a dichotomy theorem should also yield a dichotomy theorem for the special
case of MIN SAT(S) in which S is restricted to be 1-valid. In what follows, we first
establish a dichotomy theorem for this special case of MIN SAT(S) and then use it to
derive the desired dichotomy theorem for MIN SAT(S), where S is an arbitrary finite
set of logical relations.
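
The reduction of MIN SAT(S) to SAT_C(S) described above can be sketched as follows. The code assumes a formula is a list of clauses (relation as a set of bit-tuples, tuple of variable indices); the function names are illustrative, and `sat_c` is a brute-force stand-in for a SAT_C(S) oracle. When S is Schaefer, that stand-in would be replaced by the appropriate polynomial-time algorithm (2-SAT, Horn, dual Horn, or Gaussian elimination), which is what makes the overall procedure polynomial.

```python
from itertools import product

def satisfies(formula, assignment):
    """True if every clause (relation, variable indices) holds under the assignment."""
    return all(tuple(assignment[v] for v in variables) in relation
               for relation, variables in formula)

def sat_c(formula, num_vars, fixed):
    """Brute-force stand-in for a SAT_C(S) oracle.

    `fixed` maps variable indices to the constants substituted for them.
    """
    free = [v for v in range(num_vars) if v not in fixed]
    for bits in product((0, 1), repeat=len(free)):
        assignment = dict(fixed)
        assignment.update(zip(free, bits))
        if satisfies(formula, assignment):
            return True
    return False

def has_smaller_model(formula, alpha):
    """Decide whether some satisfying assignment is strictly below alpha.

    For every variable x_i with alpha[i] == 1, substitute 0 for x_i and for all
    variables that alpha already sets to 0, then ask the SAT_C oracle.
    """
    zeros = {v: 0 for v, bit in enumerate(alpha) if bit == 0}
    for i, bit in enumerate(alpha):
        if bit == 1:
            fixed = dict(zeros)
            fixed[i] = 0
            if sat_c(formula, len(alpha), fixed):
                return True
    return False
```

The sketch skips the reordering of variables used in the text (the "without loss of generality" step), since fixing the zero positions of $\alpha$ directly has the same effect; at most k oracle calls are made.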

Theorem 7. [Dichotomy of MIN SAT(S) for 1-Valid S] Let S be a 1-valid set of logical
relations.

If S is 0-valid or Schaefer, then MIN SAT(S) is solvable in polynomial time;
otherwise, it is NP-complete.

If S is Schaefer, then MIN SAT_C(S) is solvable in polynomial time; otherwise, it is
NP-complete.

Proof: Let S be a 1-valid set of logical relations. In view of the remarks preceding
the statement of the theorem, it remains to establish the intractable cases of the
two dichotomies. The proof involves three main steps; the first step uses Schaefer’s
Expressibility Theorem 5, whereas the second step requires the development of
additional technical machinery concerning the expressibility of the binary logical relation
{(0, 0), (0, 1), (1, 1)}, which represents the implication connective $\to$.

Step 1: If S is 1-valid and non-Schaefer, then SAT($\{R_{1/3}\}$) is log-space reducible to
1-MIN SAT_1($S \cup \{\to\}$). Consequently, 1-MIN SAT_1($S \cup \{\to\}$) is NP-complete.

Step 2: If S is 1-valid and non-Schaefer, then 1-MIN SAT_1($S \cup \{\to\}$) is log-space
reducible to MIN SAT_C(S). Consequently, MIN SAT_C(S) is NP-complete.

Step 3: If S is 1-valid but neither 0-valid nor Schaefer, then MIN SAT_C(S) is log-space
reducible to MIN SAT(S). Consequently, MIN SAT(S) is NP-complete.

Proof of Step 1: Assuming that S is 1-valid and non-Schaefer, we will exhibit a
log-space reduction of SAT($\{R_{1/3}\}$) to 1-MIN SAT_1($S \cup \{\to\}$). According to Definition 6,
the latter problem asks: given a CNF_1($S \cup \{\to\}$)-formula, is it satisfied by a truth
assignment that is different from the all-ones truth assignment $(1, \ldots, 1)$?

Let $\varphi(x)$ be a given CNF($\{R_{1/3}\}$)-formula, where $x = (x_1, \ldots, x_n)$ is the list
of its variables. By applying Schaefer’s Expressibility Theorem 5 to the occurrences
of $R_{1/3}$ in $\varphi(x)$, we can construct in log-space a CNF(S)-formula $\chi(x, z, w_0, w_1)$,
such that $\varphi(x) \equiv (\exists z)\chi(x, z, 0/w_0, 1/w_1)$, where $z = (z_1, \ldots, z_m)$, $w_0$, $w_1$ are new
variables different from x (substitutions of different variables by the same constant can
be easily consolidated to substitutions of the occurrences of a single variable by that
constant). Notice that the formula $\chi(x, z, w_0, 1/w_1)$, whose variables are x, z, and $w_0$,
is a CNF_1(S)-formula, since it is obtained from a CNF(S)-formula by substitutions by
1 only. Let $\psi(x, z, w_0)$ be the following formula:



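$$\chi(x, z, w_0, 1/w_1) \;\wedge\; \Bigl(\bigwedge_{i=1}^{n} (w_0 \to x_i)\Bigr) \;\wedge\; \Bigl(\bigwedge_{j=1}^{m} (w_0 \to z_j)\Bigr).$$

It is clear that $\psi(x, z, w_0)$ is a CNF_1($S \cup \{\to\}$)-formula (hence, 1-valid, because S is
1-valid) and that $\varphi(x) \equiv (\exists z)\chi(x, z, 0/w_0, 1/w_1) \equiv (\exists z)\psi(x, z, 0/w_0)$.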




It is now easy to verify that the given CNF($\{R_{1/3}\}$)-formula $\varphi(x)$ is satisfiable if
and only if the CNF_1($S \cup \{\to\}$)-formula $\psi(x, z, w_0)$ is satisfied by a truth assignment
different from the all-ones assignment $(1, \ldots, 1)$. This completes the proof of Step 1.

To motivate the proof of Step 2, let us consider the combined effect of Steps 1 and
2. Once both these steps have been established, it will follow that SAT($\{R_{1/3}\}$) is
log-space reducible to MIN SAT_C(S), which means that an NP-complete satisfiability
problem will have been reduced to a minimal satisfiability problem. Note that the
only information we have about S is that it is a 1-valid, non-Schaefer set of logical
relations. Therefore, it is natural to try to use Schaefer’s Expressibility Theorem 5 in the
desired reduction, since it tells us that $R_{1/3}$ is definable from some CNF_C(S)-formula
using existential quantification. The presence of existential quantifiers, however,
introduces a new difficulty in our context, because this way we reduce the satisfiability
of a CNF($\{R_{1/3}\}$)-formula $\varphi(x)$ to the minimal satisfiability of a CNF_C(S)-formula
$\psi(x, z)$, where z are additional variables. It is the presence of these additional variables
that creates a serious difficulty for minimal satisfiability, unlike the case of satisfiability
in Schaefer’s Dichotomy Theorem 4. Specifically, it is conceivable that, while we
toil to preserve the minimality of truth assignments to the variables x, the witnesses to
the existentially quantified variables z may very well destroy the minimality of truth
assignments to the entire list of variables x, z. Note that this difficulty was bypassed in
Step 1 by augmenting S with the implication connective $\to$, which made it possible to
produce formulas in which we control the witnesses to the variables z. The proof of
Step 2, however, hinges on the following crucial technical result that provides precise
information about the definability of the implication connective $\to$ from an arbitrary
1-valid, non-Schaefer set S of logical relations.

Key Lemma 8. Let S be a 1-valid, non-Schaefer set of logical relations. Then at least
one of the following two statements is true about the implication connective.

1. There exists a CNF_C(S)-formula $\varepsilon(x, y)$ such that $(x \to y) \equiv \varepsilon(x, y)$.

2. There exists a CNF_C(S)-formula $\eta(x, y, z)$ such that

(i) $(x \to y) \equiv (\exists z)\eta(x, y, z)$;

(ii) $\eta(x, y, z)$ is satisfied by the truth assignment (1, 1, 1);

(iii) if a truth assignment (1, 1, b) satisfies $\eta(x, y, z)$, then b = 1.

In other words, the formula $(\exists z)\eta(x, y, z)$ is logically equivalent to $(x \to y)$ and
has the additional property that 1 is the only witness for the variable z under the
truth assignment (1, 1) to the variables (x, y).

The proof of the above Key Lemma 8 and the formal proofs of Steps 1 and 2 can be
found in the full version of this paper [15]. This concludes the proof of Theorem 7.

The following three examples illustrate the preceding Theorem 7. 

Example 9. Consider the ternary logical relation K = {(1, 1, 1), (0, 1, 0), (0, 0, 1)}.
Since K is 1-valid, the satisfiability problem SAT({K}) is trivial (the answer is
always “yes”). In contrast, Theorem 7 implies that the minimal satisfiability problems
MIN SAT({K}) and MIN SAT_C({K}) are NP-complete. Indeed, it is obvious that K is
not 0-valid. Moreover, using the criteria mentioned after Definition 3, it is easy to verify
that K is neither bijunctive, nor Horn, nor dual Horn, nor affine (for instance, K is not
Horn because $(0, 1, 0) \wedge (0, 0, 1) = (0, 0, 0) \notin K$).
Note that the logical relation K can also be used to illustrate the Key Lemma 8.
Specifically, it is clear that $(x \to y)$ is logically equivalent to the formula $(\exists z)K(x, y, z)$;
moreover, 1 is the only witness for the variable z such that $(\exists z)K(1, 1, z)$ holds. As a
matter of fact, it was this particular property of K that inspired us to conceive of the
Key Lemma 8.
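
The two properties of K used here are small enough to check mechanically. The following Python lines are an illustrative check only; the names `IMPLIES`, `projection`, and `witnesses` are introduced for this sketch.

```python
# The relation K from Example 9 and the implication relation.
K = {(1, 1, 1), (0, 1, 0), (0, 0, 1)}
IMPLIES = {(0, 0), (0, 1), (1, 1)}

# Existentially quantifying z amounts to projecting K onto its first two coordinates.
projection = {(x, y) for (x, y, z) in K}

# Witnesses for z under the truth assignment (1, 1) to (x, y).
witnesses = {z for (x, y, z) in K if (x, y) == (1, 1)}

# The coordinate-wise conjunction that shows K is not Horn.
conjunction = tuple(a & b for a, b in zip((0, 1, 0), (0, 0, 1)))
```

Here `projection` coincides with `IMPLIES`, `witnesses` is the singleton containing 1, and `conjunction` is the all-zeros tuple, which is not a member of K.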



Example 10. Consider the 1-valid set $S = \{R_0, R_1, R_2\}$, where
$R_0 = \{0,1\}^3 \setminus \{(0,0,0)\}$ (expressing the clause $(x \vee y \vee z)$),
$R_1 = \{0,1\}^3 \setminus \{(1,0,0)\}$ (expressing the clause $(\neg x \vee y \vee z)$), and
$R_2 = \{0,1\}^3 \setminus \{(1,1,0)\}$ (expressing the clause $(\neg x \vee \neg y \vee z)$).
Since S is a 1-valid set, SAT(S) is trivial. In contrast, Theorem 7 implies that
MIN SAT(S) and MIN SAT_C(S) are NP-complete, since it is not hard to verify that S
is neither 0-valid nor Schaefer.

Theorem 7 yields a dichotomy for MIN SAT(S), where S is a 1-valid set of logical
relations. We will now use this result to establish a dichotomy for MIN SAT(S), where
S is an arbitrary set of logical relations. Before doing so, however, we need to introduce
the following crucial concept.

Definition 11. Let R be a k-ary logical relation and R' a relation symbol to be
interpreted as R. We say that a logical relation T is a 0-section of R if T can be defined
from the formula $R'(x_1, \ldots, x_k)$ by replacing some (possibly none), but not all, of the
variables $x_1, \ldots, x_k$ by 0.

To illustrate this concept, observe that the 1-valid logical relation {(1)} is a 0-section of
$R_{1/3} = \{(1,0,0), (0,1,0), (0,0,1)\}$, since it is definable by $R'_{1/3}(x_1, 0, 0)$. Note that
the logical relation {(1, 0), (0, 1)} is also a 0-section of $R_{1/3}$, since it is definable by
the formula $R'_{1/3}(0, x_2, x_3)$, but it is not 1-valid. In fact, it is easy to verify that {(1)}
is the only 0-section of $R_{1/3}$ that is 1-valid.
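
A short sketch can enumerate the 0-sections of a relation and keep the 1-valid ones. It assumes a relation is a non-empty Python set of equal-length bit-tuples; the function name `one_valid_zero_sections` is hypothetical. The enumeration ranges over all proper subsets of coordinates and is therefore exponential in the arity, which is harmless for the small relations considered here.

```python
from itertools import product

def one_valid_zero_sections(R):
    """Return the set of 1-valid 0-sections of R, each as a frozenset of tuples."""
    arity = len(next(iter(R)))
    result = set()
    # mask[i] == 1 keeps variable x_i; mask[i] == 0 replaces x_i by the constant 0.
    for mask in product((0, 1), repeat=arity):
        kept = [i for i in range(arity) if mask[i]]
        if not kept:
            continue  # replacing all variables by 0 is not allowed
        section = frozenset(
            tuple(t[i] for i in kept)
            for t in R
            if all(t[i] == 0 for i in range(arity) if not mask[i])
        )
        if (1,) * len(kept) in section:  # keep only the 1-valid sections
            result.add(section)
    return result
```

For $R_{1/3}$ the function returns the single relation {(1)}, matching the observation above.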

Theorem 12. [Dichotomy of MIN SAT(S)] Let S be a set of logical relations and let
S* be the set of all logical relations P such that P is both 1-valid and a 0-section of
some relation in S.

If S* is 0-valid or Schaefer, then MIN SAT(S) is solvable in polynomial time;
otherwise, it is NP-complete.

If S* is Schaefer, then MIN SAT_C(S) is solvable in polynomial time; otherwise, it
is NP-complete.

Moreover, each of these two dichotomies can be decided in polynomial time; that
is to say, there is a polynomial-time algorithm to decide whether, given a finite set S
of logical relations, MIN SAT(S) is solvable in polynomial time or NP-complete (and
similarly for MIN SAT_C(S)).

A complete proof of the above theorem can be found in the full version of this paper
[15] (in electronic form). We now present several different examples that illustrate the
power of Theorem 12.

Example 13. If m and n are two positive integers with m < n, then $R_{m/n}$ is the n-ary
logical relation consisting of all n-tuples that have m ones and n − m zeros. Clearly,
$R_{m/n}$ is neither 0-valid nor 1-valid. Moreover, it is not hard to verify that $R_{m/n}$ is not
Schaefer. Let S be a set of logical relations each of which is a relation $R_{m/n}$ for some
m and n with m < n. The preceding remarks and Schaefer’s Dichotomy Theorem 4
imply that SAT(S) is NP-complete. In contrast, the Dichotomy Theorem 12 implies that
MIN SAT(S) and MIN SAT_C(S) are solvable in polynomial time. Indeed, S* is easily
seen to be Horn (and, hence, Schaefer), since every relation P in S* is a singleton
$P = \{(1, \ldots, 1)\}$ consisting of the m-ary all-ones tuple for some m.

This family of examples contains POSITIVE-1-IN-3-SAT as the special case where
$S = \{R_{1/3}\}$; thus, Theorem 12 provides an explanation for the difference in complexity
between the satisfiability problem and the minimal satisfiability problem for
POSITIVE-1-IN-3-SAT.



Example 14. Consider the 3-ary logical relation $T = \{0,1\}^3 \setminus \{(0,0,0), (1,1,1)\}$.
In this case, SAT({T}) is the problem POSITIVE-NOT-ALL-EQUAL-3-SAT: given a
3CNF-formula $\varphi$ with clauses of the form $(x \vee y \vee z)$, is there a truth assignment such
that in each clause of $\varphi$ at least one variable is assigned value 1 and at least one variable
is assigned value 0? This problem is NP-complete. In contrast, the Dichotomy Theorem
12 easily implies that MIN SAT({T}) and MIN SAT_C({T}) are solvable in polynomial
time. To see this, observe that {T}* = {{(1)}, {(0, 1), (1, 0), (1, 1)}}, where the
logical relation {(1)} is the 0-section of T obtained from T by setting any two variables
to 0 (for instance, it is definable by the formula T'(x, 0, 0)) and the logical relation
{(0, 1), (1, 0), (1, 1)} is the 0-section of T obtained from T by setting any one variable
to 0 (for instance, it is definable by the formula T'(x, y, 0)). It is clear that each of
these two logical relations is bijunctive (actually, each is also dual Horn), hence {T}*
is Schaefer.



Example 15. 3-SAT coincides with SAT(S), where $S = \{R_0, R_1, R_2, R_3\}$ and
$R_0 = \{0,1\}^3 \setminus \{(0,0,0)\}$ (expressing the clause $(x \vee y \vee z)$),
$R_1 = \{0,1\}^3 \setminus \{(1,0,0)\}$ (expressing the clause $(\neg x \vee y \vee z)$),
$R_2 = \{0,1\}^3 \setminus \{(1,1,0)\}$ (expressing the clause $(\neg x \vee \neg y \vee z)$), and
$R_3 = \{0,1\}^3 \setminus \{(1,1,1)\}$ (expressing the clause $(\neg x \vee \neg y \vee \neg z)$).

Since the logical relations $R_0$, $R_1$, $R_2$ are 1-valid, they are members of S*. It follows
that S* is not 0-valid, since it contains $R_0$. Moreover, the logical relation $R_1$ is not Horn,
it is not bijunctive, and it is not affine, whereas the logical relation $R_2$ is not dual Horn.
Consequently, S* is not Schaefer. We can now apply Theorem 12 and immediately
conclude that MIN SAT(S) (i.e., MIN 3-SAT) is NP-complete.
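
The classification given by Theorem 12 for MIN SAT(S) can be reproduced on small examples with the sketch below. It is an illustration under stated assumptions, not the polynomial-time decision procedure mentioned in the theorem: relations are non-empty sets of bit-tuples, S is a list of such sets, all function names are chosen for this sketch, and the computation of S* enumerates subsets of coordinates (exponential in the arity).

```python
from itertools import product

def closed_under(R, op, k):
    """True if applying op coordinate-wise to any k tuples of R yields a tuple of R."""
    return all(tuple(op(*bits) for bits in zip(*tuples)) in R
               for tuples in product(R, repeat=k))

CHECKS = {
    "bijunctive": lambda R: closed_under(R, lambda a, b, c: (a & b) | (b & c) | (a & c), 3),
    "Horn": lambda R: closed_under(R, lambda a, b: a & b, 2),
    "dual Horn": lambda R: closed_under(R, lambda a, b: a | b, 2),
    "affine": lambda R: closed_under(R, lambda a, b, c: a ^ b ^ c, 3),
}

def is_schaefer(S):
    """One of the four properties must hold for every relation in S."""
    return any(all(check(R) for R in S) for check in CHECKS.values())

def is_zero_valid(S):
    """Every relation in S contains the all-zeros tuple of its arity."""
    return all((0,) * len(next(iter(R))) in R for R in S)

def one_valid_zero_sections(R):
    """All 1-valid 0-sections of R, each as a frozenset of tuples."""
    arity = len(next(iter(R)))
    result = set()
    for mask in product((0, 1), repeat=arity):  # 1 keeps x_i, 0 replaces it by 0
        kept = [i for i in range(arity) if mask[i]]
        if not kept:
            continue
        section = frozenset(tuple(t[i] for i in kept) for t in R
                            if all(t[i] == 0 for i in range(arity) if not mask[i]))
        if (1,) * len(kept) in section:
            result.add(section)
    return result

def s_star(S):
    """The set S* of Theorem 12."""
    return set().union(*(one_valid_zero_sections(R) for R in S))

def classify_min_sat(S):
    """Apply the first dichotomy of Theorem 12 to MIN SAT(S)."""
    star = s_star(S)
    if is_zero_valid(star) or is_schaefer(star):
        return "polynomial time"
    return "NP-complete"
```

On the relations of Examples 13 to 15, the sketch returns "polynomial time" for $\{R_{1/3}\}$ and for {T}, and "NP-complete" for the four relations of 3-SAT.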
Abstract. A set F of Boolean functions is called a pseudorandom function
generator (PRFG) if communicating with a randomly chosen secret function from
F cannot be efficiently distinguished from communicating with a truly random
function. We ask for the minimal hardware complexity of a PRFG. This
question is motivated by design aspects of secure secret key cryptosystems. These
should be efficient in hardware, but often are required to behave like PRFGs. By
constructing efficient distinguishing schemes we show for a wide range of basic
nonuniform complexity classes (including $TC^0_2$), that they do not contain PRFGs.
On the other hand we show that the PRFG proposed by Naor and Reingold in [24]
consists of $TC^0_4$-functions. The question if $TC^0_3$-functions can form PRFGs
remains as an interesting open problem. We further discuss relations of our results
to previous work on cryptographic limitations of learning and Natural Proofs.
Keywords: Cryptography, pseudorandomness, Boolean complexity theory,
computational distinguishability.



1 Basic Definitions 

A function generator F is an efficient (i.e., polynomial time) algorithm which for
specific values of plaintext block length n computes for each plaintext block $x \in \{0,1\}^n$
and each key s from a predefined key set $S_n \subseteq \{0,1\}^{k(n)}$ a corresponding ciphertext
output block $y = F_n(x, s) \in \{0,1\}^{l(n)}$. k(n) and l(n) are called key length and output
length of F. The efficiency of F implies that k(n) and l(n) are polynomially bounded
in n. Observe that the encryption mechanism of a secret key block cipher can be thought
of as a function generator in a straightforward way. Clearly, cryptographic algorithms
occurring in practice are usually designed for one specific input length n. However, in
many cases the definition can be generalized to infinitely many values of admissible
input length n in a more or less natural way. Correspondingly, we consider function
generators to be sequences $F = (F_n)_{n \in \mathbb{N}}$ of sets of Boolean functions

$$F_n = \{\, f_{n,s} : \{0,1\}^n \to \{0,1\}^{l(n)},\ s \in S_n \,\},$$

where, if n is admissible, we define $f_{n,s}(x) = F_n(x, s)$.

A function generator F is pseudorandom if it is infeasible to distinguish between a
(pseudorandom) function, which is randomly chosen from $F_n$, n admissible, and a truly
random function $f \in B_n^{l(n)}$. (For $l, n \in \mathbb{N}$ let $B_n^l$ denote the set of all $2^{l 2^n}$ functions
$f : \{0,1\}^n \to \{0,1\}^l$.) In the sequel, we concentrate on functions $f : \{0,1\}^n \to \{0,1\}$
and define $B_n = B_n^1$. Note that a truly random function in $B_n^{l(n)}$ is just a tuple
of l(n) independent random functions in $B_n$.

An H-oracle chooses randomly, via the uniform distribution on H, a secret function
$h \in H$ and answers membership queries for inputs $x \in \{0,1\}^n$ immediately with h(x).
A distinguishing algorithm for a function generator $F = (F_n)_{n \in \mathbb{N}}$ is a randomized oracle
Turing machine D which knows the definition of F, which gets an admissible input
parameter n and which communicates via membership queries with an H-oracle, where
either $H = B_n^{l(n)}$ (the truly random source) or $H = F_n$ (the pseudorandom source).

The aim of D is to find out whether $H = B_n^{l(n)}$ (in this case, D outputs 0) or $H = F_n$
(in this case, D outputs 1). Let us denote by $\Pr_D(f)$ the probability that D accepts
if the unknown oracle function is f. The relevant cost parameters of a distinguishing
algorithm D are the worst case running time $t_D = t_D(n)$ and the advantage
$\varepsilon_D = \varepsilon_D(n)$, which is defined as

$$\varepsilon_D(n) = \bigl|\, \Pr[D \text{ outputs } 1 \mid H = F_n] - \Pr[D \text{ outputs } 1 \mid H = B_n^{l(n)}] \,\bigr|.$$

The ratio $r_D = r_D(n)$ of a distinguishing algorithm D is $r_D(n) = t_D(n) \cdot \varepsilon_D^{-1}(n)$.

Observe further that for any function generator F, there are two trivial strategies to
distinguish it from a truly random source, which achieve ratio $O(|F_n| \log |F_n|)$, the
trivial upper bound. In both cases the distinguisher fixes a set X of inputs, where |X| is
the minimal number satisfying $2^{|X|} \ge 2|F_n|$. The first strategy is to fix a function
$f \in F_n$ and to accept if the oracle coincides with f on X. This gives running time
$O(|X|) = O(\log |F_n|)$ and advantage at least $\frac{1}{2}|F_n|^{-1}$. The second strategy is to
check via exhaustive search whether there is some $f \in F_n$ which coincides with the
oracle function on X. This implies advantage at least $\frac{1}{2}$ but running time
$O(|F_n| \log |F_n|)$.
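
The two trivial strategies can be evaluated exactly on a toy instance. The sketch below makes several assumptions that are not taken from the text: single-output functions on n = 2 input bits represented by their truth tables, a hypothetical four-element family standing in for $F_n$, and a query set X of three inputs (the smallest size with $2^{|X|} \ge 2|F_n|$). Advantages are computed as exact fractions by enumerating both sources.

```python
from fractions import Fraction
from itertools import product

n = 2
# All 2^(2^n) single-output Boolean functions on n bits, as truth tables.
all_functions = list(product((0, 1), repeat=2 ** n))

# Hypothetical toy family playing the role of F_n (an assumption of this sketch).
family = [(0, 0, 0, 0), (0, 1, 0, 1), (1, 1, 0, 0), (1, 0, 1, 1)]

# Query set X: indices of three inputs, so that 2^|X| = 8 >= 2 * |family|.
X = [0, 1, 2]
f = family[0]

def strategy1(h):
    """Accept iff the oracle function h coincides with the fixed f on X."""
    return all(h[x] == f[x] for x in X)

def strategy2(h):
    """Accept iff some member of the family coincides with h on X."""
    return any(all(h[x] == g[x] for x in X) for g in family)

def acceptance_probability(accepts, source):
    """Probability of acceptance when the oracle function is uniform over source."""
    return Fraction(sum(1 for h in source if accepts(h)), len(source))

def advantage(accepts):
    """Absolute difference between the pseudorandom and the truly random case."""
    return abs(acceptance_probability(accepts, family)
               - acceptance_probability(accepts, all_functions))
```

On this instance `advantage(strategy1)` evaluates to 1/8, which equals $\frac{1}{2}|F_n|^{-1}$ for a family of size 4, and `advantage(strategy2)` evaluates to 1/2.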

We call F a pseudorandom function generator (for short: PRFG) if for all
distinguishing algorithms D for F it holds that $r_D(n) \in 2^{n^{\Omega(1)}}$. Observe that this definition is
similar to that in [7]. The difference is that in [7] only superpolynomiality is required.

Given a complexity measure M we denote by P(M) the complexity class containing
all sequences of (multi-output) Boolean functions which have polynomial size
representations with respect to M. We say that a function generator F has M-complexity
bounded by a function $c : \mathbb{N} \to \mathbb{N}$ if for all n and all keys $s \in S_n$ it holds that
$M(f_{n,s}) \le c(n)$, and that F belongs to P(M) if the M-complexity of F is bounded
by some $c(n) \in n^{O(1)}$. We will call a complexity class cryptographically strong if it
contains a PRFG, and cryptographically weak otherwise.

It is widely believed that there exist PRFGs (see e.g. section 4), i.e., P/poly is
supposed to be cryptographically strong. Pseudorandom function generators are of great
interest in cryptography, e.g. as building blocks for block ciphers [20,21], for remotely
keyed encryption schemes [22,3], for message authentication [2], and others. As the
existence of PRFGs obviously implies $P \ne NP$, recent pseudorandomness proofs refer
to unproven cryptographic hardness assumptions. Below we search for cryptographical
strength - or weakness - for most of the basic nonuniform complexity classes.

A distinguishing algorithm $D = D(n, m)$, depending on input parameters $n$ (input length) and $m$ (complexity parameter), is a polynomial distinguishing scheme with respect to $M$ (resp. $P(M)$) if there are functions $t(n, m), \varepsilon^{-1}(n, m) \in (n + m)^{O(1)}$ such that for all polynomial bounds $m = m(n) \in n^{O(1)}$ and all (single output) functions $g \in B_n$ with $M(g) \le m(n)$ it holds that $D(n, m)$ runs in time $t(n, m)$ and

$$\Pr\nolimits_D(g) - E_{f \in B_n} \Pr\nolimits_D(f) \ge \varepsilon(n, m).$$

The definition of a quasipolynomial distinguishing scheme with respect to $M$ can be obtained by replacing $t(n, m), \varepsilon^{-1}(n, m) \in (n + m)^{O(1)}$ by $t(n, m), \varepsilon^{-1}(n, m) \in (n + m)^{\log^{O(1)}(n + m)}$. We call a distinguishing scheme efficient if it is quasipolynomial or polynomial.

If there is an efficient distinguishing scheme $D$ w.r.t. such a complexity measure $M$ then, obviously, $P(M)$ is cryptographically weak, as each output bit of a function generator in $P(M)$ can be efficiently distinguished via $D$. Consequently, as the efficiency of key length is a central design criterion for modern secret key encryption algorithms, these algorithms should have nearly maximal complexity w.r.t. such complexity measures $M$. As cryptographers are searching for encryption mechanisms having hardware implementations which are very efficient with respect to time and energy consumption, there is a low complexity danger: such designs may fall into the sphere of influence of one of the distinguishing schemes presented in this paper.

We consider several types of constant depth circuits over unbounded fan-in $\mathrm{MOD}_m$-, AND-, OR-, as well as bounded and unbounded weight threshold gates. The gate function $\mathrm{MOD}_m$ is defined by $\mathrm{MOD}_m(x_1, \dots, x_n) = 1$ if and only if $x_1 + \dots + x_n \not\equiv 0 \bmod m$. Unweighted threshold gates $T^n_{\ge r}$, resp. $T^n_{\le r}$, are defined by the relations

$$T^n_{\ge r}(x_1, \dots, x_n) = 1 \iff x_1 + \dots + x_n \ge r$$

and $T^n_{\le r}(x_1, \dots, x_n) = 1 \iff x_1 + \dots + x_n \le r$. A weighted threshold gate $T^a_{\ge r}$, where $a \in \mathbb{Z}^n$, is defined by the relation

$$T^a_{\ge r}(x_1, \dots, x_n) = 1 \iff a_1 x_1 + \dots + a_n x_n \ge r.$$

The inputs are the constants 0 and 1 and literals from $\{x_1, \dots, x_n, \bar{x}_1, \dots, \bar{x}_n\}$. The mode of computation as well as the definition of AND- and OR-gates are assumed to be known. As usual, by $AC^0_k$, $AC^0_k[m]$, $TC^0_k$ we denote the complexity classes consisting of all problems having polynomial size depth $k$ circuits over AND-, OR-, resp. AND-, OR-, $\mathrm{MOD}_m$-, resp. unweighted threshold gates.
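
The gate functions above are simple enough to state as code. The following is a minimal sketch, assuming the non-strict comparisons ($\ge$, $\le$) in the threshold relations; the function names are illustrative and do not come from the paper.

```python
def mod_gate(m, bits):
    """MOD_m gate: outputs 1 iff the number of ones is NOT divisible by m."""
    return int(sum(bits) % m != 0)


def threshold_ge(r, bits):
    """Unweighted threshold gate: outputs 1 iff at least r inputs are 1."""
    return int(sum(bits) >= r)


def threshold_le(r, bits):
    """Unweighted threshold gate: outputs 1 iff at most r inputs are 1."""
    return int(sum(bits) <= r)


def weighted_threshold(weights, r, bits):
    """Weighted threshold gate with integer weights a_1, ..., a_n."""
    return int(sum(a * x for a, x in zip(weights, bits)) >= r)
```

For example, `mod_gate(2, bits)` is the parity of `bits`, which is the bottom-layer gate of the threshold-$\mathrm{MOD}_2$ circuits considered in Theorem 1.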

We further consider branching programs, alternatively called binary decision diagrams (BDDs). A branching program for a Boolean function $f$ with $l$ output bits is a directed acyclic graph $G = (V, E)$ with $l$ sources. Each sink is labeled by a Boolean constant and each inner node by a Boolean variable. Inner nodes have two outgoing edges, one labeled by 0 and the other by 1. Given an input $a$, the output $f(a)_j$ is equal to the label of the sink reached by the unique path consistent with $a$ and starting at source $j$, $1 \le j \le l$. Relevant restricted types of branching programs are

- Ordered binary decision diagrams (OBDDs), where each computational path has to respect the same variable ordering. An OBDD which respects a fixed variable ordering $\pi$ is called a $\pi$-OBDD.

- Read-$k$-BDDs, for which on each path each variable is forbidden to occur more than $k$ times.

2 Related Work, Our Results

Cryptographic Weakness. In section 3 we present efficient distinguishing schemes for the following complexity measures:

- a quasipolynomial scheme for the size of read-$k$ BDDs (Theorem 3),

- a quasipolynomial scheme for the size of weighted Threshold-$\mathrm{MOD}_2$ circuits, i.e. depth 2 circuits with a layer of $\mathrm{MOD}_2$-gates connected with one output layer consisting of weighted threshold gates (Theorem 1),

- a quasipolynomial scheme for the size of constant depth circuits consisting of AND-, OR-, and $\mathrm{MOD}_p$-gates, $p$ prime (Theorem 2),

- a polynomial scheme for the size of unweighted threshold circuits of depth 2 (Theorem 4),

- a quasipolynomial scheme for the size of constant depth circuits having a constant number of layers of AND-, OR-gates connected with one output layer of weighted threshold gates (Theorem 5).

Observe that the function generator $f_a(x_1, \dots, x_n) = \sum_{i=1}^{n} a_i x_i$, where $a = (a_1, \dots, a_n)$ is the key and $(x_1, \dots, x_n) \in \{0, 1\}^n$, corresponding to the NP-hard Subset Sum Problem, belongs to $TC^0_2$ [28], which emphasizes the cryptographic weakness of this operation.

The complexity measures $M$ handled below represent a "frontline" in the sense that they correspond to the most powerful models for which we know effective lower bound arguments, i.e., methods to show $\Pi \notin P(M)$ for some explicitly defined problem $\Pi$. Indeed, all our distinguishing schemes are inspired by the known lower bound arguments for the corresponding models and can be seen as an "algorithmic version" of these arguments. It seems that searching for effective lower bound arguments for a complexity measure $M$ is the same problem as searching for methods to distinguish unknown $P(M)$-functions from truly random functions. Note that a similar observation, but with respect to another mode of distinguishing, was already made by Razborov and Rudich in [27]. To illustrate the difference between their approach and ours, let us review the results in [27] in some more detail, starting with the following definition.



Distinguishing Schemes versus Natural Proofs. Let $\Gamma \subseteq P/poly$ denote a complexity class and $T = (T_n) \in \Gamma$ be a sequence of Boolean functions for which the input length of $T_n$ is $N = 2^n$. $T$ is called an efficient $\Gamma$-test against a function generator $F = (F_n)_{n \in \mathbb{N}}$ (consisting of single output functions) if for all $n$

$$\left| \Pr\nolimits_f[T_n(f) = 1] - \Pr\nolimits_s[T_n(f_{n,s}) = 1] \right| \ge p^{-1}(N)$$

for a polynomially (in $N$) bounded function $p : \mathbb{N} \to \mathbb{N}$. Hereby, functions $f \in B_n$ are considered to be strings of length $N = 2^n$. The probability on the left side is taken w.r.t. the uniform distribution on $B_n$ (the truly random case), the probability on the right side is taken w.r.t. the uniform distribution on $F_n$ (the pseudorandom case). The following observation was made in [27].

(1) It seems that all complexity classes $A$ for which we know a method for proving that $\Pi \notin A$ for some explicitly defined problem $\Pi$ have a so-called $\Gamma$-Natural Proof for some complexity class $\Gamma \subseteq P/poly$ (the somewhat technical definition of Natural Proofs is omitted here).

(2) On the other hand (and this is the property of Natural Proofs which is important in our context), if $A$ has a $\Gamma$-Natural Proof then all function generators $F = (F_n)$ belonging to $A$ have efficient $\Gamma$-tests.

The main implication of [27] is that a $P/poly$-Natural Proof against $P/poly$ would imply the nonexistence of function generators which are pseudorandom w.r.t. $P/poly$-tests. But this implies the nonexistence of pseudorandom bit generators [27], contradicting widely believed cryptographic hardness assumptions.

In contrast to our concept of pseudorandomness, the existence of an efficient $\Gamma$-test for a given PRFG $F$ does not yield any feasible attack against the corresponding cipher, because the whole function table has to be processed, which is of exponential size in $n$. Thus, informally speaking, the message of [27] is that effective lower bound arguments for $M$, as a rule, imply low complexity circuits which efficiently distinguish $P(M)$-functions from truly random functions, where the complexity is measured in the size of the whole function table. Our message is that effective lower bound arguments for $M$, as a rule, imply even efficient distinguishing attacks against each secret key encryption mechanism which belongs to $P(M)$, where the running time is measured in the input length of the function. Observe that our most complicated distinguishing scheme, for the size of constant depth circuits over AND, OR, $\mathrm{MOD}_p$, $p$ prime (Theorem 2), uses an idea from [27] for constructing a Natural Proof for $AC^0[p]$, $p > 2$ prime.

Cryptographic Strength. In section 4 we try to identify the smallest complexity classes which are powerful enough to contain PRFGs. In [7], a general method for constructing PRFGs on the basis of pseudorandom bit generators is given. The construction is inherently sequential, and at first glance it seems hopeless to build PRFGs with small parallel time complexity. Naor and Reingold [23,24] used a modified construction, based on concrete number-theoretic assumptions instead of generic pseudorandom bit generators. They presented a function generator (which we call the NR-generator for short; the definition is given in section 4) which is pseudorandom under the condition that the Decisional Diffie-Hellman Assumption, a widely believed cryptographic hardness assumption, is true. Moreover, the NR-generator belongs to $TC^0$; in
[24] it is claimed (without proof) that it consists of TCg-functions. 

We show in Theorem 6 that the NR-generator even consists of $TC^0_4$-functions, i.e., $TC^0_4$ seems to be cryptographically strong while $TC^0_2$ is weak. It is an interesting open question whether $TC^0_3$ is strong enough to contain PRFGs.

Learning versus Distinguishing. Clearly, a successful distinguishing attack against a secret key encryption algorithm does not automatically imply that relevant information about the secret key can be efficiently computed. Observe that breaking the cipher corresponds to efficiently learning an unknown function from a known concept class. It is intuitively clear and easy to prove that, with respect to any reasonable model of algorithmically learning Boolean concept classes from examples, any efficient learning algorithm for functions from a given complexity class $A$ gives an efficient distinguishing scheme for $A$. (Use the learning algorithm to compute a low complexity hypothesis $h$ of the unknown function $f$ and test if $h$ really approximates $f$.) Conversely, each efficient distinguishing algorithm that makes no membership queries (i.e., poses oracle queries only for randomly chosen inputs) can be simulated by an efficient weak learning algorithm, computing a $(\frac{1}{2} + \varepsilon)$-approximator for the unknown function [4]. I.e., efficient known plaintext distinguishing attacks do clearly break a cipher. There is some evidence that in the general case, if chosen plaintext, i.e., membership queries are allowed, this is not the case. It is not hard to see that there is a polynomial distinguishing scheme for polynomial size OBDDs.* On the other hand, there are several results proved in [17] which strongly support the following conjecture: it is impossible to efficiently learn the optimal variable ordering of a function with small OBDDs from examples.

The results of this paper can be considered as cryptographic limitations of proving lower bounds for complexity classes containing $TC^0_4$, while the results of [27] can be seen as cryptographic limitations of proving lower bounds against $P/poly$. Cryptographic limitations of learning were already detected by Kearns and Valiant in [13]: efficient learnability of $TC^0_3$-functions would contradict the existence of pseudorandom bit generators in $TC^0_3$ and thus widely believed cryptographic hardness assumptions like the security of RSA or Rabin's cryptosystem.

Note that for all complexity classes $A$ which are shown in section 3 to be cryptographically weak, it is unknown whether $A$-functions are efficiently learnable.



3 Distinguishing Schemes 

We start with the basis test $T(p, \delta, N)$, where $p, \delta \in (0, 1)$, which accepts if

$$\frac{1}{N} \sum_{i=1}^{N} X_i \notin [p - \delta, p + \delta].$$

The $X_i$ denote $N$ mutually independent random variables defined by $\Pr[X_i = 1] = p$ and $\Pr[X_i = 0] = 1 - p$. Hoeffding's Inequality, see e.g. [1, Appendix A], yields

Lemma 1. The probability that $T(p, \delta, N)$ accepts is smaller than $2e^{-2\delta^2 N}$. □
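
The basis test is a one-line computation on a sample. The sketch below is illustrative only: it assumes the standard two-sided Hoeffding bound $2e^{-2\delta^2 N}$ for the acceptance probability, and the names `basis_test` and `hoeffding_bound` are hypothetical.

```python
import math


def basis_test(samples, p, delta):
    """Accept (return True) iff the empirical mean falls outside [p - delta, p + delta]."""
    mean = sum(samples) / len(samples)
    return not (p - delta <= mean <= p + delta)


def hoeffding_bound(delta, n_samples):
    """Standard two-sided Hoeffding bound on the acceptance probability
    when the samples really are independent with Pr[X_i = 1] = p."""
    return 2 * math.exp(-2 * delta ** 2 * n_samples)
```

In a distinguishing scheme the samples are oracle answers, so a small bound means that a truly random function is rarely accepted, while a function with a biased output distribution is accepted often.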

Most of our distinguishing schemes first choose a random seed $r$ from an appropriate set $R$, and then perform a corresponding test $T(r)$ on the oracle function. Such a scheme is called a $(p, q, \rho)$-test for a function $f^* \in B_n$ if it accepts a random function with probability at most $\rho$ (i.e., $E_f[\Pr_r[T(r) \text{ accepts } f]] \le \rho$), while the probability (taken over $r$) that $T(r)$ accepts $f^*$ with probability at least $q$ is at least $p$.

Lemma 2. If $pq > \rho$ then a $(p, q, \rho)$-test for $f^*$ distinguishes $f^*$ from a truly random function with advantage at least $pq - \rho$. □
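
The short argument behind Lemma 2 can be written out as follows. This is a sketch of the standard reasoning, writing $\rho$ for the bound on the acceptance probability of a random function and calling a seed $r$ good if $T(r)$ accepts $f^*$ with probability at least $q$; the term "good" is introduced here only for illustration.

```latex
\begin{align*}
\Pr[D \text{ accepts } f^*]
  &= E_r\bigl[\Pr[T(r) \text{ accepts } f^*]\bigr] \\
  &\ge \Pr\nolimits_r[r \text{ is good}] \cdot q \;\ge\; p\,q, \\
\Pr[D \text{ accepts a random } f]
  &= E_f\bigl[\Pr\nolimits_r[T(r) \text{ accepts } f]\bigr] \;\le\; \rho, \\
\text{hence}\quad
\Pr[D \text{ accepts } f^*] - \Pr[D \text{ accepts a random } f]
  &\ge p\,q - \rho .
\end{align*}
```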



Theorem 1. There is a polynomial distinguishing scheme for polynomial size weighted threshold-$\mathrm{MOD}_2$ circuits.

* Take disjoint random subsets of variables $Y$ and $Z$ of appropriate logarithmic size and test if the matrix $(f(y, z, 0))$, where $y$ and $z$ range over all assignments of $Y$ and $Z$, resp., has small rank. In the pseudorandom case, with probability $1/\mathrm{poly}(n)$, $Y$ and $Z$ are separated by the optimal variable ordering of the oracle function $f$. This gives an efficient test.

Proof. The algorithm is based on a result from Bruck [6]. If $m$ is the minimal number of $\mathrm{MOD}_2$-nodes in a weighted threshold-$\mathrm{MOD}_2$-circuit computing a given $f \in B_n$ then there is a $\mathrm{MOD}_2$-function $p(x) = x_{i_1} \oplus \dots \oplus x_{i_r}$ such that $|E_x[f(x) \oplus p(x)] - \frac{1}{2}| \ge$ […]. Let us fix a polynomial bound $m(n)$. Let the scheme $D$ work as follows on $n$ and $m = m(n)$. It chooses an appropriate number $\tilde{n}$, $\log(m) < \tilde{n} \le n$, chooses a random $\mathrm{MOD}_2$-function $p(x)$ over $\{x_1, \dots, x_{\tilde{n}}\}$ and accepts if $|E_{x \in \{0,1\}^{\tilde{n}}}[f(x, 0) \oplus p(x)] - \frac{1}{2}| \ge$ […]. Observe that the running time is linear in $N = 2^{\tilde{n}}$ and that this test is a $(1/N, 1,$ […]$)$-test on each function $f^* \in B_n$ having weighted threshold-$\mathrm{MOD}_2$ circuits of size $m$. (Observe the above mentioned result [6] and the fact that the subfunction $f(\cdot, 0)$ has size at most $m$.) It is easy to see that we can find some $\tilde{n} \in O(\log(n))$ yielding advantage […] (see Lemma 2). □
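
The quantity tested in this proof is the bias of $f \oplus p$ for a parity $p$. A minimal sketch of that computation by exhaustive enumeration follows; the acceptance threshold is left as a parameter because its exact value is an assumption here, and the function names are hypothetical.

```python
from itertools import product


def parity_bias(f, index_set, n_bits):
    """Return |E_x[f(x) XOR p(x)] - 1/2|, where p is the XOR of the
    variables x_i with i in index_set and x ranges over {0,1}^n_bits."""
    total = 0
    for x in product((0, 1), repeat=n_bits):
        p = sum(x[i] for i in index_set) % 2
        total += f(x) ^ p
    return abs(total / 2 ** n_bits - 0.5)


def parity_test_accepts(f, index_set, n_bits, threshold):
    """Accept iff the bias reaches the (assumed) threshold."""
    return parity_bias(f, index_set, n_bits) >= threshold
```

In the scheme, `f` would be the oracle restricted to the first few variables with all remaining variables set to 0, and `index_set` would be chosen at random; the loop makes $2^{\tilde{n}}$ oracle queries, matching the stated running time.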

Theorem 2. For all primes $p$ and all constant depth bounds $d$ there is a quasipolynomial distinguishing scheme for polynomial size depth $d$ circuits over $\{AND, OR, \mathrm{MOD}_p\}$.

The proof is quite lengthy and can be found in the full paper [14]. As $\mathrm{MOD}_{p^k}$ belongs to $AC^0[p]$ [29], the proof for prime powers follows immediately.

Theorem 3. For all $k \ge 1$ there is a quasipolynomial distinguishing scheme for nondeterministic read-$k$ BDDs.

Proof. The first exponential lower bounds on read-$k$ branching programs were independently proved in [5] and [26]; see also [12]. We use these methods for our distinguishing scheme. Let us fix an arbitrary natural constant $k \ge 1$ and a polynomial bound $m = m(n)$. Let us denote $X_n = \{x_1, \dots, x_n\}$. Jukna [12] shows the existence of a number $s \in$ […] and a constant $\gamma \in (0, 1)$ such that each $f \in B_n$ which is computable by a nondeterministic syntactic read-$k$ times branching program of size $m(n)$ can be written as $f = \bigvee_{i=1}^{s} f_i$, where for all $i$, $1 \le i \le s$, it holds that there is a partition $X_n = U_i \cup V_i \cup W_i$ of pairwise disjoint subsets $U_i, V_i, W_i$ of $X_n$ such that $f_i(X_n) = g_i(U_i, V_i) \wedge h_i(V_i, W_i)$, where $|U_i| \ge \gamma n$ and $|W_i| \ge \gamma n$.

The distinguishing scheme $D$ works on $n$ and $m = m(n)$ as follows.

(0) Fix an appropriate $N \in$ […] and test via $T(\frac{1}{2},$ […]$, N)$ if the probability that the oracle function outputs 1 is at least […]. If not, accept.

(1) Compute $s$ and parameters $q, r \in$ […]. Let $Q = 2^q$. Choose randomly disjoint subsets $U, W$ from $X_n$ with $|U| = |W| = q$, and a $\{0, 1\}$-assignment $b$ of $X_n \setminus (U \cup W)$. Finally, choose random $\{0, 1\}$-assignments $a^1, \dots, a^r$ of $U$.

(2) Accept iff $f(a^1, b, c) \wedge \dots \wedge f(a^r, b, c) = 1$ for at least […] assignments $c$ of $W$.

The parameters $q$, $N$, and $r$ will be specified later. Observe that the running time is $O(rQ)$. Observe further that the probability that a truly random function will be accepted in Step 2 is bounded by […] for $\delta =$ […] $- 2^{-r}$ (see Lemma 1).

In the pseudorandom case, $U \subseteq U_j$ and $W \subseteq W_j$ holds for some $j$ for which $\Pr_x[f_j(x) = 1] \ge$ […] with probability […]. Further, with probability […] we have $b$ fixed in such a way that $\Pr_{a,c}[f_j(a, b, c) = 1] \ge$ […], where $a$ and $c$ denote the assignments of $U$ and $W$ respectively. This implies $\Pr_a[g_j(a, b) = 1] \ge$ […] and $\Pr_c[h_j(b, c) = 1] \ge$ […]. Thus, with probability $p =$ […], $g_j(a^1, b) = \dots = g_j(a^r, b) = 1$ holds. Under this condition, it holds for all assignments $c$ to $W$ and all $l$, $1 \le l \le r$, that $f_j(a^l, b, c) = 1$ iff $h_j(b, c) = 1$ iff $f_j(a^i, b, c) = 1$ for all $i$, $1 \le i \le r$. As $f_j(a^l, b, c) = 1$ implies $f(a^l, b, c) = 1$, the function is accepted in Step 2 with probability 1. We obtain that Steps 1 and 2 form a $(p, 1,$ […]$)$-test for each function $f$ of size at most $m$. It can be easily verified that for suitable $q$ and $r$, logarithmic in $s$ and $n$, we can find some $N$ such that $D(n, m)$ achieves an advantage $\varepsilon(n, m)$ for which $\varepsilon(n, m)^{-1}$ is quasipolynomially bounded. □

Theorem 4. There is a polynomial distinguishing scheme for polynomial size unweighted depth 2 threshold circuits.

Proof. For all distributed functions $f : \{0, 1\}^n \times \{0, 1\}^n \to \{0, 1\}$ consider the invariants $\gamma(f) = \max \{ |E_{x,y}[f(x, y) \oplus g(x) \oplus h(y)] - \frac{1}{2}| \; ; \; g, h \in B_n \}$ and $\alpha(f) = \max \{ |E_y[f(x, y) \oplus f(x', y)] - \frac{1}{2}| \; ; \; x \ne x' \in \{0, 1\}^n \}$.

The first exponential lower bound on the size of unweighted depth 2 threshold circuits was proved in [10]. The following two observations are implicitly contained there. Let us fix an arbitrary polynomial bound $m = m(n)$.

(I) There is a number $S \in$ […] such that if $f : \{0, 1\}^n \times \{0, 1\}^n \to \{0, 1\}$ has unweighted depth 2 threshold circuits of size $m(n)$ then $\gamma(f) \ge$ […].

(II) For all $f : \{0, 1\}^n \times \{0, 1\}^n \to \{0, 1\}$ it holds that $\gamma(f) \le (\frac{1}{2}(\alpha(f) + 2^{-n}))^{1/2}$.

The distinguishing scheme $D = D(n, m)$ is defined to do the following on $n$ and $m$. It chooses an appropriate number $q \in O(\log(n))$ such that for $Q = 2^q$ the condition $Q >$ […] is satisfied, and two random assignments $x \ne x'$ of $\{x_1, \dots, x_q\}$. $D$ accepts if

$$\left| E_{y \in \{0,1\}^q}[f(x, y, 0) \oplus f(x', y, 0)] - \tfrac{1}{2} \right| \ge […].$$

Observe that the probability that this test accepts a truly random function is the same as the probability that test $T(\frac{1}{2},$ […]$, Q)$ accepts, which is bounded by Lemma 1. On the other hand, for all oracle functions of size at most $m$ the following holds: if in Step 1 the pair $x, x'$ determining $\alpha(f(\cdot, \cdot, 0))$ is chosen (and this occurs with probability $1/(Q(Q - 1))$) then Step 2 will accept with probability 1. In other words, we have a $(1/(Q(Q - 1)), 1,$ […]$)$-test. It is quite easy to verify that we can fix some $q \in O(\log(n))$ which gives an advantage $\varepsilon(n, m)$ for $D(n, m)$ such that $\varepsilon^{-1}(n, m)$ is polynomially bounded. □

Theorem 5. For all $k \ge 1$ it holds that there is a distinguishing algorithm of quasipolynomially bounded ratio for depth $k + 1$ circuits consisting of $k$ levels of AND and OR gates connected with one weighted threshold gate as output gate.

The proof uses the "Switching Lemma" [11] and can be found in the full paper [14].

4 Pseudorandom $TC^0$-Functions

The NR-generator $F$ is defined as follows. For all $n$ the keys $s$ for $F$ have the form $s = (P, Q, g, r, a_1, \dots, a_n)$, where $P$ and $Q$ are primes, $Q$ divides $P - 1$, $g \in \mathbb{Z}_P^*$ has multiplicative order $Q$, and $a_1, \dots, a_n$ are from $\mathbb{Z}_Q^*$. Define the corresponding function $f_s : \{0, 1\}^n \to \mathbb{Z}_P \subseteq \{0, 1\}^n$ by

$$f_s(x) = f_s(x_1, \dots, x_n) = g^{y(x)} \bmod P,$$

where $y(x) = \prod_{i=1}^{n} a_i^{x_i}$. For our purpose it is obviously sufficient to show

Theorem 6. The function $f = f_s$ has polynomial size depth 4 unweighted threshold circuits.
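
To make the shape of such a function concrete, here is a toy sketch of a map of the form $f_s(x) = g^{y(x)} \bmod P$ with $y(x) = \prod_i a_i^{x_i}$. The name `nr_function` and the tiny parameters in the usage note are illustrative assumptions, far too small for any cryptographic use, and the sketch ignores the key component $r$.

```python
def nr_function(P, Q, g, a, x):
    """Evaluate g**(prod of a_i over all i with x_i = 1) mod P.

    Assumes P and Q are primes with Q dividing P - 1, g has multiplicative
    order Q modulo P, and every a_i is invertible modulo Q. Because g has
    order Q, the exponent may be reduced modulo Q.
    """
    exponent = 1
    for a_i, x_i in zip(a, x):
        if x_i:
            exponent = (exponent * a_i) % Q
    return pow(g, exponent, P)
```

For instance, with $P = 23$, $Q = 11$ and $g = 2$ (which has order 11 modulo 23), the key part $a = (3, 5)$ gives exponent 15 on input $(1, 1)$, and $2^{15} \bmod 23 = 16$.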

Proof. We use the following terminology and facts about threshold circuits which are mainly based on results from [8,9,28].

Definition 1. A Boolean function $g : \{0, 1\}^n \to \{0, 1\}$ is called $t$-bounded if there are integer weights $w_1, \dots, w_n$ and $t$ pairwise disjoint intervals $[a_k, b_k]$, $1 \le k \le t$, of the real line such that ($g(x_1, \dots, x_n) = 1 \iff \exists k$ s.t. $\sum_{i=1}^{n} w_i x_i \in [a_k, b_k]$); $g$ is called polynomially bounded if $g$ is $t$-bounded for some $t \in n^{O(1)}$. A multi-output function is called $t$-bounded if each output bit is a $t$-bounded Boolean function.

Fact 1: Suppose that a function $f : \{0, 1\}^n \to \{0, 1\}^n$ can be computed by a depth $d$ circuit of polynomial size, where each gate of the circuit performs a function which can be written as a sum of at most $s \in n^{O(1)}$ polynomially bounded operations. Then $f$ can be computed by a polynomial size depth $d + 1$ unbounded weight threshold circuit.

Observe the following statements which can be easily proved.

Fact 2: If $g(x_1, \dots, x_n)$ depends only on a linear combination $\sum_{i=1}^{n} w_i x_i$, where for all $i$, $1 \le i \le n$, it holds that $|w_i| \in n^{O(1)}$, then $g$ is a polynomially bounded operation.

Fact 3: If a Boolean function $g : \{0, 1\}^n \to \{0, 1\}$ can be written as $g = h(g_1, \dots, g_c)$, where $c$ is a constant and the Boolean functions $g_1, \dots, g_c : \{0, 1\}^n \to \{0, 1\}$ are polynomially bounded operations, then $g$ is a polynomially bounded operation.

As for many other efficient threshold circuit constructions, the key idea is to parallelize the computation of $f(x)$ via Chinese remaindering. Let us fix the first $r$ prime numbers $p_1, \dots, p_r$, where $r$ is the smallest number such that $\Pi := \prod_{1 \le k \le r} p_k \ge \prod_{i=1}^{n} a_i$. Observe that $r \in O(n^2)$ and that all $p_i$, $1 \le i \le r$, are polynomially bounded in $n$, i.e., can be written as $m$-bit numbers for some $m \in O(\log n)$.

Consider the inverse Chinese remaindering transformation $CRT^{-1}$ which assigns to each $r$-tuple of $m$-bit numbers $(z^1, \dots, z^r)$, $z^i = (z^i_{m-1}, \dots, z^i_0)$ for $i = 1, \dots, r$, the uniquely defined number $y < \Pi$ for which $y \equiv z^i \bmod p_i$ for all $i = 1, \dots, r$. Denote by $CRT_P^{-1}$ the function $CRT_P^{-1} : (\{0, 1\}^m)^r \to \{0, 1\}^n$ defined as $(CRT^{-1}(z^1, \dots, z^r) \bmod P)$, and observe

Fact 4: $CRT_P^{-1}$ can be written as the sum of polynomially (in $n$) many polynomially bounded operations.

The proof (see, e.g., [28]) is based on the fact that

$$CRT^{-1}(z^1, \dots, z^r) = \sum_{i=1}^{r} E_i z^i \bmod \Pi,$$

where for $i = 1, \dots, r$ the number $E_i$ denotes the uniquely determined number smaller than $\Pi$ for which $(E_i \bmod p_j) = \delta_{ij}$ for all $i, j = 1, \dots, r$. This implies

$$CRT^{-1}(z^1, \dots, z^r) = \sum_{i=1}^{r} E_i \Bigl( \sum_{j=0}^{m-1} 2^j z^i_j \Bigr) \bmod \Pi = \sum_{i=1}^{r} \sum_{j=0}^{m-1} e_{ij} z^i_j \bmod \Pi$$

for $e_{ij} = (E_i 2^j \bmod \Pi)$.
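
The reconstruction formula can be checked numerically. The sketch below computes the numbers $E_i$ and evaluates both the residue-wise sum and the bit-wise sum with coefficients $e_{ij} = E_i 2^j \bmod \Pi$; the function names are illustrative, and `pow(x, -1, p)` and `math.prod` assume Python 3.8 or later.

```python
from math import prod


def crt_basis(primes):
    """Return (Pi, [E_1, ..., E_r]) with E_i = 1 mod p_i and E_i = 0 mod p_j for j != i."""
    big = prod(primes)
    basis = []
    for p in primes:
        cofactor = big // p
        basis.append(cofactor * pow(cofactor, -1, p) % big)
    return big, basis


def crt_inverse(residues, primes):
    """CRT^{-1}: the unique y < Pi with y = z^i mod p_i for all i."""
    big, basis = crt_basis(primes)
    return sum(e * z for e, z in zip(basis, residues)) % big


def crt_inverse_bitwise(residues, primes, m):
    """Same value, written as a sum over the individual bits z^i_j of the m-bit residues."""
    big, basis = crt_basis(primes)
    total = 0
    for e, z in zip(basis, residues):
        for j in range(m):
            e_ij = e * 2 ** j % big
            total += e_ij * ((z >> j) & 1)
    return total % big
```

The bit-wise form is the one that matters for the circuit construction: before the final reduction it is a plain integer linear combination of input bits.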

The computation of $f(x)$ will be performed on 3 consecutive levels consisting of operations which are polynomially bounded (levels 1, 2) or which can be written as polynomial length sums of polynomially bounded operations.

Level 1: Compute $z(x) = (z^1(x), \dots, z^r(x))$, where for all $i = 1, \dots, r$, the $m$-bit number $z^i$ is defined to be $(y(x) \bmod p_i)$.

Observe that for all $i = 1, \dots, r$, $z^i(x)$ can be written as

$$\Bigl( \prod_{j=1}^{n} a_j^{x_j} \Bigr) \bmod p_i = \alpha_i^{\sum_{j=1}^{n} r^i_j x_j} \bmod p_i,$$

where $\alpha_i$ denotes a fixed element of order $p_i - 1$ in $\mathbb{Z}_{p_i}^*$ and $r^i_j$ denotes for $j = 1, \dots, n$ the discrete logarithm of $a_j$ to the base $\alpha_i$. Because all $r^i_j$ are polynomially bounded in $n$, it follows by Fact 2 that $z(x)$ is a polynomially bounded operation.
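
The discrete logarithm trick of Level 1 turns a product into a linear combination in the exponent. A small sketch, assuming that no $a_j$ is divisible by the prime in question and using brute-force discrete logarithms (fine here because the primes $p_i$ are only polynomially large); the function names are hypothetical.

```python
def discrete_log(value, base, p):
    """Smallest e in [0, p-2] with base**e = value (mod p); brute force."""
    acc = 1
    for e in range(p - 1):
        if acc == value % p:
            return e
        acc = acc * base % p
    raise ValueError("value is not a power of base modulo p")


def residue_via_product(a, x, p):
    """(prod_j a_j**x_j) mod p, computed directly."""
    result = 1
    for a_j, x_j in zip(a, x):
        if x_j:
            result = result * a_j % p
    return result


def residue_via_logs(a, x, p, alpha):
    """Same residue, computed as alpha**(sum_j r_j * x_j) mod p."""
    logs = [discrete_log(a_j, alpha, p) for a_j in a]
    return pow(alpha, sum(r_j * x_j for r_j, x_j in zip(logs, x)), p)
```

The exponent in `residue_via_logs` is a linear combination of the input bits with small weights, which is exactly the form required by Fact 2.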

For all inputs $z = (z^1, \dots, z^r) \in (\{0, 1\}^m)^r$ denote by $Y(z)$ the number

$$Y(z) = \sum_{i=1}^{r} \sum_{j=0}^{m-1} e_{ij} z^i_j.$$

Observe that for all $x$ it holds that $y(x) = Y(z(x)) \bmod \Pi$ and $Y(z(x)) < mr\Pi$. Moreover, there exists exactly one $k$, $0 \le k \le mr - 1$, such that $y(x) = Y(z(x)) - k\Pi$. This $k$ is characterized by $k\Pi \le Y(z(x)) \le (k + 1)\Pi - 1$. Hence, $f = f_0 + \dots + f_{mr-1}$ holds, where for each $k = 0, \dots, mr - 1$, the function $f_k$ is defined as

$$f_k(x) = \chi_k(z(x)) \bigl( g^{Y(z(x)) - k\Pi} \bmod P \bigr),$$

where $\chi_k(z(x)) \in \{0, 1\}$ is defined by $\chi_k(z(x)) = 1$ iff $k\Pi \le Y(z(x)) \le (k + 1)\Pi - 1$.

Further observe that $g^{Y(z) - k\Pi} \bmod P = G_k(z) \bmod P$, where $G_k(z) = c_k \prod_{i=1}^{r} \prod_{j=0}^{m-1} b_{ij}^{z^i_j}$ and the $c_k$ and $b_{ij}$ are $n$-bit numbers defined by $c_k = (g^{-k\Pi} \bmod P)$ and $b_{ij} = (g^{e_{ij}} \bmod P)$. In contrast to $g^{Y(z) - k\Pi}$, the number $G_k(z)$ has polynomially many bits, namely $n(mr + 1)$. Fix $u$ to be the smallest number with $\prod_{i=1}^{u} p_i \ge 2^{n(mr+1)}$. By the same arguments as above (Level 1), the operation $(G_k(z) \bmod p_i)$ is for all $i = 1, \dots, u$ polynomially bounded.

Level 2: For all $k = 0, \dots, mr - 1$ and $i = 1, \dots, u$ compute

$$H^i_k(z) = \chi_k(z) (G_k(z) \bmod p_i).$$

This is a polynomially bounded operation as each output bit depends only on two polynomially bounded operations (Fact 3).

Level 3: Compute $f_k(x) = CRT_P^{-1}(H^1_k(z(x)), \dots, H^u_k(z(x)))$.



Theorem 6 follows from Fact 4 and Fact 1. □

5 Open Problems

We would like to detect for each basic nonuniform complexity class $A = P(M)$ whether it has an efficient distinguishing scheme (then cryptodesigners should heed the low complexity danger w.r.t. $M$) or whether $A$ contains a PRFG (then lower bound proofs for this model seem to be a very serious task). Alas, for classes like $TC^0_3$ and $AC^0[m]$, $m$ composite, this is still unknown. Is $TC^0_3$ strong enough to contain PRFGs? Observe that $TC^0_3$ seems to contain pseudorandom bit generators: operations such as squaring modulo the product of two unknown primes are in $TC^0_3$ [28].

Another open problem is the design of an efficient distinguishing scheme for polynomial size weighted threshold-$\mathrm{MOD}_p$ circuits, $p$ an odd prime power. This is the only example of a complexity measure for which we failed to transform the known effective lower bound method (see [15]) into a distinguishing algorithm.

Also, we would like to determine the minimal hardware complexity of other cryptographic primitives like pseudorandom bit generators, pseudorandom permutation generators, one-way functions and cryptographically secure hash functions. Does $TC^0_2$ contain pseudorandom bit generators? Luby and Rackoff [20] presented a construction for pseudorandom permutations by three sequential applications of a pseudorandom function, each followed by an XOR-operation. They also showed how to construct super
pseudorandom permutations by four such applications. Thus, as a corollary of our re- 
sults, efficient pseudorandom permutations can be constructed in TCJq and efficient su- 
per pseudorandom permutations can be constructed in TC 53 . We conjecture that these 
results can be further improved, perhaps based on the results from [25].



Abstract. We study some versions of the problem of finding the minimum size 2-connected subgraph. This problem is NP-hard (even on cubic planar graphs) and MAX SNP-hard. We show that the minimum 2-edge connected subgraph problem can be approximated to within $\frac{4}{3} - \epsilon$ for general graphs, improving upon the recent result of Vempala and Vetta [14]. Better approximations are obtained for planar graphs and for cubic graphs. We also consider the generalization where requirements of 1 or 2 edge or vertex disjoint paths are specified between every pair of vertices, and the aim is to find a minimum subgraph satisfying these requirements. We show that this problem can be approximated within $\frac{3}{2}$, generalizing earlier results for 2-connectivity. We also analyze the classical local optimization heuristics. For cubic graphs, our results imply a new upper bound on the integrality gap of the linear programming formulation for the 2-edge connectivity problem.



1 Introduction 

Graph connectivity is an important topic in theory and practice. It finds applications in the design of computer and telecommunication networks, and in the design of transportation systems. Networks with a certain level of connectivity, which intuitively means that they provide a certain number of connections between sites, are able to maintain reliable communication between sites, even when some of the network elements fail. For a survey and further applications, see Grötschel et al. [9].

Problem Statement. Given a graph with weights on its edges, and an integral connectivity requirement function $r_{uv}$ for each pair of vertices $u$ and $v$, the vertex connectivity (edge connectivity, respectively) survivable network design problem (SNDP) is to find a minimum weight subgraph containing at least $r_{uv}$ vertex (edge, respectively) disjoint paths between each pair $u, v$ of vertices. If $r_{uv} \in X$ for some set $X$, for each pair $u, v$, we denote the problem as $X$-VC-SNDP ($X$-EC-SNDP, respectively). The term survivable refers to the fact that the network is tolerant to the failures of sites and links (in case
(ALCOM-FT).

** The author was supported by Deutsche Forschungsgemeinschaft (DFG) Graduate Scholarship. 
Part of the work by this author was done while he was visiting the Combinatorics & Optimiza- 
tion Dept., University of Waterloo, Ontario, Canada, during January-March, 2000, and was 
partially supported by NSERC grant no. OGP0138432 of Joseph Cheriyan. 



A. Ferreira and H. Reichel (Eds.): STACS 2001, LNCS 2010, pp. 431-442, 2001.
© Springer-Verlag Berlin Heidelberg 2001







Piotr Krysta and V.S. Anil Kumar 



of VC-SNDP) or links (for EC-SNDP). Even the simplest versions of these problems are NP-hard, and so approximation algorithms¹ are of interest.

Previous Results for General Cases. For $\mathbb{N}_{\ge 0}$-EC-SNDP with arbitrary edge weights, there is an $O(\log(r_{\max}))$-approximation by Goemans et al. [6] ($r_{\max} = \max_{u,v}(r_{uv})$), which was recently improved to 2 by Jain [10]. No algorithm with a non-trivial guarantee is known for the general version of VC-SNDP. For $\{0, 1, 2\}$-VC-SNDP with arbitrary edge weights, Ravi and Williamson [13] gave a 3-approximation algorithm.

Unweighted Low-Connectivity Problems. The case of low connectivity requirements is of particular importance, as in practice networks have rather small connectivities. There has been intense research on problems with low connectivity requirements and all the edge weights being equal to one (unweighted problems) [2,5,11,14]. We focus on the special unweighted cases of this problem where each $r_{uv} \in \{1, 2\}$. These are the simplest non-trivial versions of this problem and have been studied for a long time, but tight approximation guarantees and inapproximability results are not fully understood.
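
To make the feasibility condition for unweighted 2-EC concrete: a spanning subgraph is feasible exactly when it is connected and stays connected after deleting any single edge. The following brute-force check is only an illustrative sketch (the function names and the edge-list representation are assumptions, and it is not one of the approximation algorithms discussed here).

```python
def is_connected(n, edges):
    """Check connectivity of the graph on vertices 0..n-1 with the given edge list."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def is_two_edge_connected(n, edges):
    """True iff the graph is connected and has no bridge (checked edge by edge)."""
    if not is_connected(n, edges):
        return False
    return all(is_connected(n, edges[:i] + edges[i + 1:]) for i in range(len(edges)))
```

A cycle on all vertices passes the check with $n$ edges, which is why $n$ is the basic lower bound on the optimum against which approximation ratios for 2-EC are measured.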

For the unweighted $\{2\}$-EC-SNDP (or 2-EC) Khuller and Vishkin [11] gave a $\frac{3}{2}$-approximation, which was improved by Cheriyan et al. [2] to $\frac{17}{12}$, and recently to $\frac{4}{3}$ by Vempala and Vetta [14]. For the unweighted $\{2\}$-VC-SNDP (or 2-VC) Khuller and Vishkin [11] gave an algorithm with approximation guarantee of $\frac{5}{3}$, which was improved to $\frac{3}{2}$ by Garg et al. [5] and to $\frac{4}{3}$ by Vempala and Vetta [14]. Both unweighted 2-VC and 2-EC problems are NP-hard even on cubic planar graphs, and also MAX SNP-hard [3]. For both unweighted problems $\{1, 2, \dots, k\}$-VC-SNDP and $\{1, 2, \dots, k\}$-EC-SNDP, the results of Nagamochi and Ibaraki [12] imply $k$-approximation algorithms.

The Linear Programming (LP) relaxation for the 2-EC problem and the subtour relaxation for TSP are very closely related [2]. The approximation ratio of $\frac{4}{3}$ obtained by Vempala and Vetta [14] has a special significance, because of the connections with the $\frac{4}{3}$ conjecture for metric TSP [1,2]. In this regard, the issue of whether $\frac{4}{3}$ can be improved for 2-EC is an interesting question. Also, the integrality gap² of the LP relaxation for 2-EC is not well understood. Vempala and Vetta [14] say that their result does not imply the same bound on the integrality gap for 2-EC. We prove a bound of better than $\frac{4}{3}$ on the integrality gap for cubic graphs, i.e. graphs with maximum degree at most 3.

Little is known about vertex-eonneetivity generalizations where arbitrary require- 
ments are allowed, even for unweighted graphs. The simplest sueh generalization is to 
allow requirements of 1 or 2, instead of 2 for every pair. It should be noted that if the 
requirement ean also take value of zero, the unweighted and weighted problems are es- 
sentially identieal: an edge with an integer weight w ean be replaeed by a path of Steiner 
vertiees of length w. For instanee, unweighted {0, 1, 2}-VC-SNDP is equivalent to the 
weighted {0, 1, 2}-VC-SNDP eonsidered by Ravi and Williamson [13]. 

Our Contributions. We give improved approximation algorithms for the 2-edge connectivity
(2-EC) and the {1, 2}-connectivity problems. We show a (4/3 − ε)-approximation

* A polynomial time algorithm is called an α-approximation algorithm, or is said to achieve
an approximation (or performance) guarantee of α, if it finds a solution of weight at most α
times the weight of an optimal solution. α is also called an approximation ratio (factor).

^ The LP for the unweighted 2-EC is $\min\{\sum_{e \in E} x_e : \sum_{e \in \delta(S)} x_e \geq 2,\ \forall S \subset V,\ S \neq \emptyset;\ x_e \geq 0,\ \forall e \in E\}$,
where $\delta(S)$ is the set of edges with exactly one end vertex in S. The ratio of the
optimum integral solution value of 2-EC to the value of the LP is called the integrality gap.

algorithm for 2-EC on general graphs, where ε = 1/1344. Our algorithm extends the
technique of Vempala and Vetta [14] and removes the bottlenecks mentioned in their paper.
The main new ideas are a better charging scheme, a refined lower bound, and the use
of a local search heuristic. We show a tight example for their lower bound and their
analysis, which leads naturally to the lower bound we use. The improved approximation
is obtained by proving, through a more careful analysis, that paths can be connected
for a better charge; the analysis is of comparable complexity to theirs, but is more
uniform because it only deals with paths. Since it is unlikely that an approximation scheme
exists for the 2-EC problem (because of the MAX SNP-hardness), finding the best possible
approximation guarantee for this classical problem is interesting (Section 3).

We achieve better guarantees for special classes of graphs, on which 2-EC is still
NP-hard. For planar graphs, we show for 2-EC a ^-approximation, but in quasi-polynomial
time, by computing a stronger lower bound (Section 4). For 2-EC on cubic
graphs we obtain a (21/16 + ε)-approximation using a simple local search heuristic. This
implies an integrality gap of at most 21/16 + ε for the standard LP on cubic graphs (Section 5).

For the {1, 2}-VC-SNDP and {1, 2}-EC-SNDP (henceforth denoted by {1, 2}-VC
and {1, 2}-EC), we give a 3/2-approximation. This improves on the straightforward
2-approximation. Our algorithms are generalizations of the algorithms of Garg et al. [5] and of
Khuller and Vishkin [11]. The lower bounds used in [5,11] do not apply to our problems
and we generalize them appropriately. We also analyze the performance of the classical
local optimization heuristics (Section 6). Finally, Section 7 has some conclusions. Most
of the details and proofs are omitted from this abstract, and will appear in the full version.



2 Preliminaries 

Graph Theory. We consider only undirected simple graphs. Given a graph G = (V, E),
we write V(G) = V (vertices) and E(G) = E (edges). Sets of vertices or contracted
subgraphs will sometimes be called (super) nodes. C_l denotes a cycle of length l. A u-v
path is a path with end vertices u, v. d_G(v) is the degree of vertex v in G. For definitions
of the following standard graph theory notions the reader is referred to [7]: cut vertex,
two vertices separated by a cut vertex, bridge (or cut edge), ear decomposition ℰ =
{Q_0, Q_1, ..., Q_k} (Q_0 is just one vertex), ears (Q_i's), l-ear (an ear with l edges),
open/closed ear and ear decomposition. If a graph is 2-vertex(edge)-connected, then
we write that it is 2-VC(EC). It is well known that a graph is 2-EC (2-VC) iff it has
no bridge (cut vertex, resp.), and iff it has an (open, resp.) ear decomposition. Also, an
(open) ear decomposition can be found in polynomial time.
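
The characterization "2-EC iff connected with no bridge" can be checked directly. The following is a minimal sketch, assuming vertices are numbered 0..n-1 and the graph is given as a list of edges; the function names are ours, and the bridge search is the standard DFS low-point technique rather than anything specific to this paper.

```python
def find_bridges(n, edges):
    """Return the bridges of an undirected graph on vertices 0..n-1."""
    adj = [[] for _ in range(n)]
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))
    disc = [-1] * n          # DFS discovery times
    low = [0] * n            # lowest discovery time reachable from the subtree
    bridges = []
    timer = 0
    for root in range(n):
        if disc[root] != -1:
            continue
        disc[root] = low[root] = timer
        timer += 1
        stack = [(root, -1, iter(adj[root]))]
        while stack:
            u, parent_edge, it = stack[-1]
            advanced = False
            for v, idx in it:
                if idx == parent_edge:
                    continue             # do not reuse the tree edge to the parent
                if disc[v] == -1:
                    disc[v] = low[v] = timer
                    timer += 1
                    stack.append((v, idx, iter(adj[v])))
                    advanced = True
                    break
                low[u] = min(low[u], disc[v])
            if not advanced:
                stack.pop()
                if stack:
                    p = stack[-1][0]
                    low[p] = min(low[p], low[u])
                    if low[u] > disc[p]:  # nothing below u reaches p or above
                        bridges.append(edges[parent_edge])
    return bridges


def is_connected(n, edges):
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n


def is_two_edge_connected(n, edges):
    """A graph is 2-EC iff it is connected and has no bridge."""
    return n >= 1 and is_connected(n, edges) and not find_bridges(n, edges)
```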

Let ℰ be an ear decomposition of a 2-connected graph. We call an ear S ∈ ℰ of
length ≥ 2 pendant if none of the internal vertices of S is an end vertex of another ear
T ∈ ℰ of length ≥ 2. Let ℰ' ⊆ ℰ be a subset of ears of the ear decomposition ℰ. We
say that set ℰ' is terminal in ℰ if: (1) every ear in ℰ' is a pendant ear of ℰ, (2) for every
pair of ears S, T ∈ ℰ' there is no edge between an internal vertex of S and an internal
vertex of T, and (3) every ear in ℰ' is open.

Given a rooted tree T, [a, b] denotes the a-b path in tree T for some two vertices
a, b such that b is an ancestor of a, and path [a, b] contains both vertices a, b. We define
[a, b) (and (a, b], resp.) similarly, but the path contains a (b, resp.) and does not contain
b (a, resp.). If a is a proper descendant of b in T, we say that a is below or lower than b,
and b is above or higher than a. A vertex in T is also a descendant and an ancestor of itself.
opt(G) or opt denotes the size of an optimal solution on G to the problem under consideration.

Preliminary Reductions. Given a graph G = (V, E), the problem of finding a minimum
size subgraph G' of G in which every vertex of V has degree at least 2, is called
D2. We refer to G' as D2, or D2 solution on G. This problem can be solved exactly in
polynomial time [14], and it gives a lower bound for both 2-EC and 2-VC. The graph
for 2-EC can be assumed to have no pair of adjacent degree 2 vertices (cf. [14]), and no
cut vertices (else, we can solve 2-EC separately on each 2-VC component).

We now define beta structures [14]. A vertex u is a beta vertex if deleting some two
adjacent vertices v_1, v_2 leaves at least 3 components, one of which is just u (Fig. 1(a)).
Two adjacent vertices u_1, u_2 are called a beta pair if there are two other vertices v_1, v_2
whose removal leaves at least 3 components, one of which is just u_1 and u_2 (Fig. 1(b)).
Fig. 1(b) shows all the four edges (v_1, u_1), (v_1, u_2), (v_2, u_1), (v_2, u_2), but it may be the
case for a beta pair that just three of them are present. A graph with no beta vertex or
beta pair is called beta-free. Vempala and Vetta [14] show that any α-approximation
algorithm for 2-EC on beta-free graphs can be turned into an α-approximation algorithm
for 2-EC on general graphs. Thus, we can assume for 2-EC that the graph is beta-free.

Let C = C_l (l ≥ 3) be a given cycle in G. If any solution to 2-EC on G uses at least
l' edges from the subgraph induced on V(C), then we can contract cycle C (i.e. identify
the vertices in V(C) and delete self loops) into a super node and solve recursively 2-EC
on the resulting graph, incurring a factor of l/l' as in [14]. The solution to 2-EC on G will
then be the union of E(C) with the edges of the recursive solution. l' here is used as a
“local” lower bound. The overall approximation ratio of such an algorithm is ≥ l/l'.
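
The contraction step can be sketched as follows. This is only an illustration under our own assumptions: the function name, the default label "C" for the super node, and the choice to keep parallel edges (each contracted edge remembers the original edge it came from) are not taken from [14].

```python
def contract_cycle(edges, cycle_vertices, super_node="C"):
    """Identify all vertices of the cycle with one super node.

    Self loops created by the contraction are deleted. Parallel edges are
    kept, since they correspond to distinct edges of the input graph.
    Returns a list of (a, b, original_edge) triples.
    """
    members = set(cycle_vertices)

    def rename(x):
        return super_node if x in members else x

    contracted = []
    for u, v in edges:
        a, b = rename(u), rename(v)
        if a != b:  # drop self loops
            contracted.append((a, b, (u, v)))
    return contracted
```

A recursive solution on the contracted graph can be mapped back to G through the stored original edges, and is then united with the l edges of the cycle.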

Let T be a given rooted spanning tree, some of whose vertices might be super nodes
corresponding to subgraphs of the input graph. For a given non-root super node N of T
the tree edge to the parent of N is called the upper tree edge of N. If N is a non-leaf
super node, then any of the tree edges to its children is called the lower tree edge of N.

Local Optimization Heuristics. We define here a general local optimization heuristic.
Let Π be a minimization problem on G = (V, E), where we want to find a spanning
subgraph of G with minimum number of edges, which is feasible for (or w.r.t.) problem
Π. Given a non-negative integer j and any feasible solution H ⊆ G to problem Π, we
define the j-opt heuristic as the algorithm which repeats, if possible, the following:

- if there are subsets E_0 ⊆ E \ E(H), E_1 ⊆ E(H) (|E_0| ≤ j, |E_1| > |E_0|) such
  that (H \ E_1) ∪ E_0 is feasible w.r.t. Π, then set H ← (H \ E_1) ∪ E_0.

The algorithm outputs H, if it can perform no such operation on H any more. We say
that such an output solution is j-opt (or j-optimal) w.r.t. Π. If |E_0| = j, then we call
the operation above a j-opt exchange. The algorithm can be implemented to run in
polynomial time when j is a fixed constant.
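
A brute-force sketch of the j-opt heuristic is given below. It assumes that feasibility is supplied as an oracle is_feasible on edge sets and that feasibility is monotone (adding edges to a feasible solution keeps it feasible, as for D2 and 2-EC); under this assumption it suffices to try |E_1| = |E_0| + 1. The function names are ours, and no attempt is made at an efficient implementation.

```python
from itertools import combinations


def find_improvement(all_edges, H, is_feasible, j):
    """Look for E0 outside H and E1 inside H with |E0| <= j, |E1| = |E0| + 1
    such that (H - E1) | E0 is feasible. Returns the new edge set or None."""
    outside = [e for e in all_edges if e not in H]
    inside = list(H)
    for k in range(j + 1):
        for E0 in combinations(outside, k):
            for E1 in combinations(inside, k + 1):
                candidate = (H - set(E1)) | set(E0)
                if is_feasible(candidate):
                    return candidate
    return None


def j_opt(all_edges, solution, is_feasible, j):
    """Repeat improving exchanges until the solution is j-opt."""
    H = set(solution)
    while True:
        better = find_improvement(all_edges, H, is_feasible, j)
        if better is None:
            return H
        H = better  # every exchange removes at least one edge, so this terminates
```

For a fixed j, each search step inspects polynomially many pairs (E_0, E_1), and at most |E| exchanges are performed.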

3 Approximating 2-Edge Connectivity: General Graphs

We start by considering the algorithm of Vempala and Vetta [14], which gives a 4/3-
approximation for the 2-edge connectivity problem. They use D2 as the lower bound. The
beta-free example in Fig. 1(c) shows that their lower bound is tight. T_1, ..., T_k are
3-cycles connected to vertices v, w of C, and C is a small clique. Clearly, the optimum
must use 4 edges for each T_i, and asymptotically, |OPT| = 4n/3, where n is the number
of vertices. This suggests that we should use a modified lower bound of max(|D2|, 4l),
where l is the number of such disjoint triangles.




Fig. 1. (a), (b) Beta-Structures, (c) A Tight Example for the Lower Bound in [14].

Our algorithm is an extension and refinement of that of Vempala and Vetta [14]. The 
main differences in our algorithm are: 

- In a necessary preprocessing step, we use a simple local optimization heuristic to
  eliminate some complicated configurations.

- We observe that paths can be connected up with charge less than 1/3 by partly charging
  their child nodes and by using the fact that neither beta-vertices nor cut vertices
  are present. The analysis of [14] mentions that one of the bottlenecks in their 4/3
  approximation is the charge for 2-connecting paths. The reason that they only get a
  charge of 1/3 is that they only use the property that cut vertices are missing.

- We actually obtain a stronger property: paths can not only be connected up with
  charge less than 1/3, they can also pay for an extra amount with the charge remaining
  less than 1/3 (in some cases, such paths can only pay for a smaller extra amount). The
  algorithm first computes some DFS spanning tree of the graph, different from the DFS
  spanning tree in [14]. Then it 2-connects the tree in a top-down manner ([14] does this
  bottom-up). This is critical, since it allows us to charge the added edges to children.

- Paths whose parent is incident at one end can be 2-connected with charge better
  than 1/3, even if one free edge is not available, by partly charging the children.

- Another bottleneck in [14] is 3-cycles, and a 1/3 charge for these seems inevitable
  in their analysis. We overcome this by either merging 3-cycles with their children
  to form longer paths, or charging them together with their children. For leaf 3-cycles,
  we use a stronger lower bound of max(|D2|, 4l), where l is the number of such leaves.



3.1 The Algorithm 

A description of our algorithm is given below. 

1. Compute D2 solution and partition it into paths and 2-EC components as in [14]. 
See Section 3.2 below for details. 

2. Run a 1-opt heuristic on the D2 solution w.r.t. D2, and replace the current D2
solution with the 1-opt one.

3. Transform each 2-EC component into a cycle of length at least 3 by contracting
suitable vertices and edges. See Section 3.2 for details.

4. Shrink all paths in D2 into single nodes. 

5. Perform DFS on the contracted graph with the following rule for deciding the order
of exploring neighbors on entering a node.

- On entering a path: Suppose that the parent enters this path node at vertex
  v. Then start DFS from the end of the path that is farther away from v, and
  proceed till the other end.

- On entering a cycle C = v_1, ..., v_k: Traverse the whole C first in the order
  v_1, v_2, ..., v_{k−6}, v_{π(k−5)}, ..., v_{π(k)}, where π(k−5), ..., π(k) is a
  permutation of k−5, ..., k. The permutation π is chosen such that: (i) one can traverse
  the cycle in this order, and (ii) if possible, v_{π(k)} is a vertex leading to an unvisited
  vertex. That is, of all the possible traversals of the last 6 vertices of the cycle, choose
  one that allows the last vertex to have a child. If the child is another cycle, we
  get a longer path. If no such v_{π(k)} exists, any permutation is equally good.

Uncontract the nodes representing paths in D2, and let T be the resulting DFS tree.

6. Decompose T into maximal paths in the following manner: paths in D2 remain 
paths now, and we call them red paths. All edges of a cycle are part of one path. If 
a collection of cycles or paths can be merged, they will form a new path. Paths cor- 
responding only to cycles are called blue paths, whereas a path formed by merging 
cycles and paths of D2 is a red path. The only difference between these is that a red 
path has two free edges (cf Section 3.2 and [14]). Consider the natural parent child 
relation between these paths. The parent of a blue path is incident to it at one end. 

7. Consider the paths in a top down manner. Each path is connected using extra edges. 
For a blue path, 2-connecting up just means connecting it within, since one end of 
it is part of the parent. 2-connecting paths involves forming blocks (Section 3.3). 

The next subsection describes the way edges are added and charged for each path.
The charge on each edge of each path other than blue paths of length three at leaves
is at most α, for some α < 1/3. Since our lower bound is max(|D2|, 4l), where l is the
number of blue paths of length three at leaves, the approximation factor is at most

$1 + \frac{\alpha(|D2| - 3l) + l}{\max(|D2|, 4l)} \le 1 + \alpha + \frac{1 - 3\alpha}{4} = 1 + \frac{1 + \alpha}{4}$.

For α = 1/3 − 4ε, the factor is at most 4/3 − ε < 4/3. This gives us the following theorem.

Theorem 1. The minimum 2-edge connected subgraph problem on unweighted graphs
can be approximated within a factor of 4/3 − ε, where ε = 1/1344, in time O(n^{2.5}).

Remark. The running time, as in [14], is the time required to find the D2. If D2 has no 
C3 or C 4 , then our proof shows that the charge on D2 is at most || (cf Section 4). 

3.2 Preprocessing Details 

Partitioning D2. We partition the D2 solution as described in [14], as follows. Partition 
D2 into connected components. Each such component is formed by 2-EC components, 
connected by paths. By an easy reduction, we can assume that any 2-EC component 
with at least 4 D2 edges contains no C_3.

Let us now fix some connected component C in D2. We will partition any such
connected component C as follows. Any maximal 2-EC component of C is called a
supernode. Contract all the maximal 2-EC components of C into single nodes. Partition
the resulting tree (from C) into maximal paths arbitrarily. Then, any of these paths is
also called a supernode. The key point here is that any such path supernode has two
associated “free” edges: the first and last edge of the path, which we call free edges.

Contracting into Cycles. We contract parts of each 2-EC component and solve the
problem on the resulting graph. The following lemma describes properties of the
resulting 2-EC components. The local search in step 2 of our algorithm is used in its
proof.

Lemma 1. The contracted final subgraph resulting from a 2-EC component is a simple 
cycle of length at least three. Moreover, each contracted super node contains inside at 
least four edges of the D2 solution. 

3.3 2-Connecting Paths 

The paths are 2-connected by adding extra edges. A path is partitioned into blocks, and
one extra edge is used for each block, and the cost of this edge is charged to the vertices
of the block. This is sufficient if the blocks have length at least 4. For smaller blocks we
charge the children also. At a high level, 2-connecting a path node N is as follows. We
process the path in the direction opposite to the direction of DFS (the matter of direction
is critical). Assume that vertices up to u_0 ∈ N have been 2-connected already, and the
subsequent vertices are u_1, u_2, ..., u_k, with u_k being the last vertex of N. Contract all
the nodes not in the subtree rooted at N, and the vertices of N up to u_0, into u_0. Now
consider the farthest going edge e from u_0. We have the following cases.

1. e is incident on u_i, i ≥ 4: The collection of vertices from u_1 to the other end of e
forms a block of size at least 4, and we add e to 2-connect this block.

2. e is incident on u_3: If k = 3, this is the last block of N and is connected differently
(see below). Else, there must be an edge f from u_1 or u_2 beyond u_3, to prevent it
from being a cut vertex. If f is from u_2 to some vertex beyond u_3, add e, f and
delete (u_2, u_3). The vertices from u_1 till the other end of f form a block of size
at least 4, and a total of 1 edge is used for this block. Now let f be from u_1 to a
vertex beyond u_3, and let there be no edge from u_2 to u_0, or beyond u_3. Then u_2 is
a beta-vertex, and there must be a child node C from u_2. Vertices u_1, u_2, u_3 form a
block of length 3, and we add e, and partly charge C, to reduce the charge.

3. e is incident on u_2: If e goes to an ancestor, and the upper tree edge is incident
on u_2, N has to be a red path and we form a block of length 2 (vertices u_1, u_2),
which is connected by adding the upper tree edge and e. If k = 2, this is the last
block, which is handled later. Otherwise, there must be an edge f from u_1, beyond
u_2 (else u_2 is a cut vertex). If f goes beyond u_3, form a block of length at least 4
with the vertices from u_1 to the other end of f, and add e, f and delete (u_1, u_2). If
f = (u_1, u_3), there must be an edge f' from u_2, beyond u_3. The vertices from u_1 to
the other end of f' form a block, and we add e, f, f' and delete (u_1, u_2), (u_2, u_3).

4. e is incident on u_1: If k = 1, this is the last block, which is handled later. Otherwise
this is possible only if N is a red path, and e goes to an ancestor. In this case, we
use part of the free edges available for red paths. The block will be of length 1, and
gets connected by installing the upper tree edge and an upper back edge.

Remark. The above cases have been stated in a highly oversimplified manner. The
arguments about cut vertices above might not hold if some vertex is a contracted supernode.
But in this case, we can charge the D2 edges inside this supernode, using Lemma 1. For
each extra edge to pay we have at least 4 D2 edges available, and < f if f > j- 
Last Block. The description above assumes that the other end of e is not the last vertex; 
the blocks formed above are called intermediate blocks. 2 -connecting the last block 
poses problems if its size is at most 3. In the case of red paths, we use the fact that 
two free edges are available for connecting the path. We use up | of these while 2- 
connecting the intermediate blocks, and ^ is available while dealing with the last block. 
For the last block of a blue path, we do not have any free edge available and the algo- 
rithm adds edges based on the last and the second last block together. 

The Charging Strategy. As mentioned before, the edges of all the paths, and two extra
(free) edges per each red path are part of the lower bound (D2). We add extra edges to
2-connect each path, and the cost of these edges is charged to the edges of the paths. For
each block, the cost of adding an edge to connect it is charged to the edges in the block.
This works only for blocks of length at least 4, since we want to beat the 1/3 barrier.
Therefore, we will partly charge child nodes of the current node as well, to reduce the
charge on the current node. The way the child node is charged depends on whether it is
red or blue. This is due to the difference in the structure of red and blue paths: a blue
path is connected to its parent at one end, while a red path need not be. For a red path,
we shall make sure that it can not only get 2-connected for charge less than 1/3 but can
also pay for some extra charge, which will be used by the parent node.

Claim. Any red path can be 2-connected within and upwards, and also pay for an extra 
while incurring a charge less than ^ < 5 - A red path, such that no back edge from 
a descendant has already been installed, can pay for an extra 

The extra charge for the red path is achieved by making each block pay an extra 
i. Finally, we need to prove that the algorithm produces a feasible solution. The path 
nodes are connected top-down, and each time a path is processed, it is 2-connected
to its parent. Thus, if the edges added by the ancestors are not altered, the solution
would indeed be 2-connected. This is true, because in most cases (except possibly when
considering a red path with 2 internal edges), the upper tree edge from each of the paths
is retained. Even in the case of a red path with 2 internal edges, the upper tree edge is
not retained only if no back edge from any descendant has been installed. Thus, an ear
decomposition of the solution edges can be constructed, which proves its feasibility.



4 The Algorithm for Planar Graphs 

Let G = (V, E) be a given planar graph with |V| = n. As before, assume that G is
2-VC and has no beta-structures, as in [14]. We show how to compute the minimum
subgraph CD2 which is connected and in which each vertex has degree at least two.
This is a stronger lower bound than D2. We actually compute a (1 + ε) approximation
to CD2, worsening our overall approximation by an extra (1 + ε) multiplicative factor.
By a simple reduction from the Hamilton cycle problem, the CD2 problem is NP-hard,
even on cubic planar graphs. Modifying the approach of Grigni et al. [8], we can obtain:

Lemma 2. There is a (1 + e) -approximation algorithm for the CD2 problem on planar 
graphs (for any £ > Oj, with running time of \ 

If n > 5, one can show that any CD2 solution can be transformed in polynomial 
time to another CD2 solution of no greater size, and without any C3 or C4. Thus, the 
analysis in Section 3 gives a better ratio, since all blue paths have length at least 5 now. 
This gives us the following. 

Theorem 2. For any given e > 0, a ( || -f e) -approximate solution to the minimum size 
2-EC problem on a planar graph can be found in time \ 

Remark. A more careful analysis yields a -t- e) -approximation for planar graphs 
with the same running time. 

5 The Algorithm for Cubic Graphs 

In this section we give a local search based approximation algorithm for the 2-EC
problem on cubic graphs. Let G = (V, E) be a given 2-VC cubic graph, with |V| = n. Our
algorithm has two steps. The first step involves computing an ear decomposition H of
G with the minimum number φ of even ears (i.e. of even length), using the algorithm of
Frank [4,2] (delete all 1-ears, since they are redundant). It is easy to see that n + φ − 1
is a lower bound on the size of the optimum 2-EC solution in G [2].

Let a j-opt exchange that does not increase the number of even ears in H be called a
parity-preserving j-opt exchange. In the 2nd step, run the parity-preserving 1-opt heuristic
on H w.r.t. 2-EC. Let H' be the resulting ear decomposition. Given an ear S in an ear
decomposition ℰ, we say that an internal vertex v in S is free if d_ℰ(v) = 2.

Claim. H' is an open ear decomposition with all 2- and 3-ears being terminal. Any 
5-ear in H' has at least two free vertices. 

Let p_i be the total number of internal vertices in the i-ears of H'. Then, p_i/(i − 1)
is the number of i-ears. We can estimate the size of the solution as:

$|E(H')| \le \sum_{i=2}^{8} \frac{i}{i-1}\, p_i + \frac{9}{8}\Bigl(n - \sum_{i=2}^{8} p_i\Bigr)$;

the first summation on the right denotes the number of edges in i-ears for i = 2, 3, ..., 8.
This gives:

$|E(H')| \le \bigl(\tfrac{9}{8} n + \tfrac{7}{8} p_2 + \tfrac{5}{24} p_4 + \tfrac{3}{40} p_6 + \tfrac{1}{56} p_8\bigr) + \bigl(\tfrac{3}{8} p_3 + \tfrac{1}{8} p_5 + \tfrac{1}{24} p_7\bigr)$.

Since we have used only parity-preserving exchanges, the bound of φ still applies
to the number of even ears: $n + p_2 + \frac{1}{3} p_4 + \frac{1}{5} p_6 + \frac{1}{7} p_8 - 1 \le n + \varphi - 1$, thus
$\frac{7}{8}\bigl(n + p_2 + \frac{1}{3} p_4 + \frac{1}{5} p_6 + \frac{1}{7} p_8\bigr) \le \frac{7}{8}(n + \varphi - 1) + \frac{7}{8} \le \frac{7}{8} opt + \frac{7}{8}$.
So the first term in the brackets in |E(H')| can be upper bounded by (7/8)opt + (1/4)n + 7/8.

By the claim above, each 3-ear (5-ear, 7-ear, resp.) has at least 4 (4, 2, resp.) vertices
associated with it (2 (2, 0, resp.) free vertices, and the two end vertices). Therefore, we
can lower bound the number of vertices as follows: $4\frac{p_3}{2} + 4\frac{p_5}{4} + 2\frac{p_7}{6} \le n$, which gives
us that $\frac{3}{8} p_3 + \frac{1}{8} p_5 + \frac{1}{24} p_7 \le \frac{3}{16} n$. This we use to upper bound the second term in the
brackets in |E(H')|. Finally, since n ≤ opt, |E(H')| ≤ (21/16 + ε)opt, for any ε > 0.

Theorem 3. There is a polynomial time local search (21/16 + ε)-approximation algorithm
(for any ε > 0) for 2-EC on cubic graphs.

Application: Integrality Gap of the LP. Let LP be the value of the standard linear
program for 2-EC. Obviously, n ≤ LP. [2] shows that n + φ − 1 ≤ LP. Thus, we
obtain that the integrality gap of the LP is at most 21/16 + ε < 4/3 on unweighted cubic graphs.

6 Approximating {1,2}-Connectivity

Due to the lack of space, we will make here a simplifying assumption that G is 2-vertex
connected. Define the function r_u as: r_u = max{r_uv : v ∈ V \ {u}}. We can assume that
there exists a pair u, v ∈ V with r_uv = 2, else the solution is a spanning tree.

6.1 Algorithm for {1, 2}-VC

We first present our modified lower bound for {1, 2}-VC. We assume that the input
graph has no cut vertices.

Lemma 3. If there is a pair u, v ∈ V with r_uv = 2, then opt(G) ≥ max(n, 2|I|),
where I ⊆ V is an independent set of vertices v with r_v = 2.

Our algorithm has two phases. The first phase is the same as in [5]. It computes a 2- 
VC spanning subgraph E': very roughly, it first finds a rooted DFS spanning tree, then 
adds extra back edges to 2-vertex connect the DFS tree, by processing it bottom-up. In 
the process, blocks are formed on addition of a back edge. The blocks form a partition 
of the vertices, finally. Our second phase is different from that in [5]. 

The Algorithm - Phase 2. After the 1st phase, set E' consisting of the edges of the
DFS tree and of one back edge out of each non-root block has been output. I = ∅
initially. The 2nd phase will try for each block B to delete a tree edge within B, or, if this
is impossible, to find a vertex s with r_s = 2 and add it to I, such that I remains independent.
The 2nd phase will also modify the set of back edges.

Like in the second phase in [5], we traverse the blocks top-down. At each step we 
fix a block, say B, and B decides on the back edges going out of the child blocks of B. 
The first step is made with the root block: it chooses the farthest going back edge from 
each of its child blocks. Any child block having its back edge decided, decides on the 
back edges for its child blocks in a way we will specify. We proceed towards the leaves. 

Let B be some non-root and non-leaf block for which the decision about the back
edge e = e(B) out of it has been made. Let v be an end vertex of e with v ∈ B, and let
u = u(B) be the other end vertex of e. Let p = p(B) be the parent vertex of B.

Block Property. We can assume that: (1) the back edge e(B) goes higher than p(B),
and (2) there is a u(B)-p(B) path that goes through the ancestor blocks of B.

The above property is obviously true for child blocks of the root block. We show
that it is maintained during the algorithm, see Lemma 4.

Let w' be the highest vertex in path [v, p) with d_T(w') ≥ 3. If there is no such
vertex in [v, p), then let w' = v. Let w'' be the lowest vertex in (w', p] which has a back
edge from some child block of B. It can be shown that both w' and w'' are well defined.
We note that the path [w', w''] has length (i.e. the number of tree edges) at least one. Let
q be the parent vertex of w'. The algorithm considers the following cases.

1. Assume that q ≠ w'', and there is a vertex s ∈ [q, w'') with r_s = 2. Then, we label
block B and vertex s with MARKED, and add s to the set I. B decides to retain
the 1st phase choices of the farthest going back edges from all child blocks of B.

2. Assume that either q ≠ w'', and for any vertex s ∈ [q, w''), we have r_s = 1, or
q = w''. In this case the tree edge (q, w') is deleted from the current solution. We
will show later that this step preserves the connectivities. Let e' be the back edge
going from some child block B' of B into vertex w''. Then B decides not to take
the farthest going back edge from B', but e' instead. For all other child blocks of
B, B decides to retain the choice of the back edges made by the 1st phase.

In case 2 of the above algorithm when we delete a tree edge within block B, we pay 
this way for the back edge going out of B. Thus, in these cases the back edge is for free, 
and all such blocks are labeled FREE. Finally, the root block has no back edge going 
out of it and so is labeled FREE. Each leaf block is itself a vertex of requirement two, 
and so we label it MARKED, and choose all the leaf vertices into I. 

Lemma 4. (1) Each case of the 2nd phase of the algorithm maintains the Block Prop- 
erty, and preserves the feasibility of the solution w.r.t. r. (2) The set I of MARKED 
vertices is an independent set in G. Also all the vertices in I have the requirement of 2. 

Theorem 4. The above algorithm is a linear time 3/2-approximation algorithm for the
unweighted {1, 2}-VC problem.

6.2 Algorithms for {1, 2}-EC

Application of the Previous Algorithm. A straightforward application of the algorithm
of Section 6.1 gives a |-approximation algorithm for {1, 2}-EC.

Simple Algorithm. A simple modification of the algorithm of Khuller and Vishkin
[11] leads to a 3/2-approximation algorithm for {1, 2}-EC. Let G = (V, E), r be a given
instance of {1, 2}-EC. Find a DFS spanning tree of G and keep all the tree edges in our
solution subgraph H. Whenever the DFS backs up over a tree edge e, check if e is a
cut-edge of the current H (i.e. none of the back edges in H covers e). If yes, and if the cut
(S, S̄) given by e separates a vertex pair x, y with r_xy = 2, add the farthest going back
edge covering e into H. Also, “mark” the cut (S, S̄), where S is the vertex set of the
subtree below e, and {x, y} is separated by (S, S̄) if S has exactly one of x, y.

The number of tree edges in H is at most n − 1, which is at most opt(G). The number
of the back edges in H is equal to the number of “marked” cuts (S, S̄). Because of the
DFS tree property and the way back edges were chosen, any two such cuts are edge
disjoint. Thus, the optimal solution to {1, 2}-EC must have at least 2 edges in each of
these cuts. So, the number of these cuts is at most (1/2)opt(G), and |E(H)| ≤ (3/2)opt(G).
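
The simple algorithm can be sketched as follows. This is an unoptimized illustration under our own assumptions: vertices are numbered 0..n-1, the graph is simple and connected, the pairs with requirement 2 are passed as a collection pairs2 (all other pairs have requirement 1), and the function name is hypothetical. Subtree membership is tested with DFS entry and exit times, and each tree edge is examined in post-order, i.e. when the DFS backs up over it.

```python
def simple_12_ec(n, edges, pairs2):
    """DFS tree plus farthest-going back edges for cuts separating an r = 2 pair."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    parent, depth = [-1] * n, [0] * n
    tin, tout = [-1] * n, [0] * n
    postorder, timer = [], 0
    tin[0] = timer
    timer += 1
    stack = [(0, iter(adj[0]))]
    while stack:
        u, it = stack[-1]
        for v in it:
            if tin[v] == -1:
                parent[v], depth[v] = u, depth[u] + 1
                tin[v] = timer
                timer += 1
                stack.append((v, iter(adj[v])))
                break
        else:  # all neighbours explored: the DFS backs up from u
            tout[u] = timer
            postorder.append(u)
            stack.pop()
    if any(t == -1 for t in tin):
        raise ValueError("graph must be connected")

    def inside(v, x):  # is x in the subtree rooted at v?
        return tin[v] <= tin[x] < tout[v]

    def crosses(v, a, b):  # exactly one of a, b lies below the tree edge (parent[v], v)
        return inside(v, a) != inside(v, b)

    tree = [(parent[v], v) for v in range(n) if parent[v] != -1]
    back = [(u, v) for u, v in edges if parent[u] != v and parent[v] != u]
    chosen = []
    for v in postorder:
        if parent[v] == -1:
            continue  # root: no tree edge above it
        if any(crosses(v, a, b) for a, b in chosen):
            continue  # tree edge already covered by a back edge in H
        if not any(crosses(v, x, y) for x, y in pairs2):
            continue  # the cut separates no pair with requirement 2
        candidates = [e for e in back if crosses(v, *e)]
        if not candidates:
            raise ValueError("instance is infeasible: a bridge separates an r = 2 pair")
        # farthest going: the upper end vertex is as close to the root as possible
        chosen.append(min(candidates, key=lambda e: min(depth[e[0]], depth[e[1]])))
    return tree + chosen
```

On a 4-cycle with one pendant vertex and a single pair of requirement 2 on the cycle, the sketch keeps the four tree edges and adds the one back edge closing the cycle.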

Local Optimization Heuristics. Generalizing the local optimization techniques used
in Section 5, we can obtain the following results.

Theorem 5. There are local search based algorithms which achieve a ^-approximation
for the {1, 2}-VC problem and a ^-approximation for the {1, 2}-EC problem.

Remark. Theorem 5 generalizes a result for the 2-VC communicated by J. Cheriyan. 

7 Conclusions 

While our improvement for 2-EC is small, it shows that 4/3 can certainly be improved.
With a more careful analysis, it should be possible to improve the approximation factor
further. We think that similar methods can also improve the approximation ratio for
2-VC. We have also presented approximation algorithms for {1, 2}-VC and {1, 2}-EC.
The algorithms are based on depth-first-search methods and on local search heuristics. 
It is important to note that almost all our algorithms use local search as the main method 
or as a subroutine. We think it would be interesting to further develop applications of 
the local search paradigm in the area of approximating connectivity problems.

Abstract. We give a uniform proof for the recognizability of sets of finite
words, traces, or N-free pomsets that are axiomatized in monadic
second order logic. This proof method uses Shelah’s composition theorem
for bounded theories. Using this method, we can also show that elementarily
axiomatizable sets are aperiodic. In the second part of the paper,
it is shown that width-bounded and aperiodic sets of N-free pomsets are
elementarily axiomatizable.



1 Introduction 

In theoretical computer science, the notion of a recognizable subset of a monoid 
and more generally of an algebra is of outstanding importance. Here, recogniz- 
ability means to be recognized by a homomorphism into a finite algebra. Often, 
this algebraic notion is equivalent to the more combinatorial notion of regularity, 
i.e. acceptance by a finite automaton. Seen as subset of an algebra, recognizable 
sets can often be described by certain rational expressions. 

On the other hand, often the elements of the algebra in consideration carry an 
internal structure. For instance, words can be seen as labeled linear orders, terms 
as labeled ordered trees, Mazurkiewicz traces as dependence graphs etc. If such 
an internal structure is present, it is natural to consider sets of such structures 
that share a typical property. Classical results state that properties expressed in 
monadic second order logic give rise to recognizable sets. This holds for words, 
terms, Mazurkiewicz traces, computations of stably concurrent automata, local 
traces, and many others. If one restricts the expressibility of the logic, corre- 
sponding restrictions of the set of recognizable sets can be described. 

In the first part of this paper, we are interested in the fact that logically 
expressed properties give rise to recognizable sets. Several very different proofs 
can be found in the literature: Using closure properties of the set of regular 
sets of words [17] as well as of traces [18], Thomas shows that any monadically 
axiomatizable set of words or traces can be accepted by a finite (asynchronous) 
automaton. An alternative proof by Ladner [12] for words uses Ehrenfeucht-Fraïssé
games. Courcelle [1] interprets the counting monadic second order theory
of a graph in the monadic second order theory of the generating term, then he
appeals to Doner’s Theorem [3]. Ebinger and Muscholl [7] interpret the monadic
theory of a trace in the monadic theory of its linear representations and then use
Büchi’s theorem for words. This method is extended to concurrency monoids in
[5]. In [4,10,11], the respective result is shown using the observation that any 
monadically axiomatizable set is the projection of a monadically axiomatizable 
set of traces. 

It is the aim of the first part to give a uniform proof method for the fact 
that monadically axiomatizable sets are recognizable. This is achieved by model 
theoretic methods, in particular a weak form of Shelah’s composition theorem 
[16]. For words, the idea is the following: Suppose one wants to know whether a
monadic sentence $\varphi$ holds in the concatenation of two words v and w. Then it
is not necessary to know the words v and w completely, but it suffices to know
which monadic sentences are satisfied by v and by w, respectively. Even more,
one can restrict attention to monadic sentences of the quantifier depth of $\varphi$.
This composition theorem defines a semigroup structure on the set of bounded
monadic theories of finite words and a homomorphism from the free semigroup
onto the semigroup of bounded theories. As this semigroup is finite, any
monadically axiomatizable set is recognizable. This idea is worked out more generally
for D-algebras of graphs (see below for a precise definition). As a corollary, we
obtain one direction of Büchi-type theorems for words, for semitraces, for traces,
and for N-free pomsets. The result for N-free pomsets generalizes a result from
[11] where we had to impose additional restrictions on the N-free pomsets.

Dealing with first order logic, we again construct an algebra of bounded 
elementary theories and show that this algebra, taken with any of its binary
operations, is an aperiodic semigroup. This implies one direction of McNaughton and Papert’s
theorem for words [14], for semitraces, for traces [7], and for N-free pomsets 
where it is new. 

The second part of the paper exclusively deals with N-free pomsets, in partic- 
ular with the question whether aperiodic sets are elementarily axiomatizable. It 
is shown that width-bounded and aperiodic sets of N-free pomsets are starfree; 
the restriction to width-bounded sets originates in our proof, but we conjecture 
that it cannot be dropped in general. In particular we believe that the set of trees
is not starfree (it is aperiodic since it is elementarily axiomatizable). Finally, we 
show that starfree sets are elementarily axiomatizable. The only difficulty in this 
proof is the parallel product of elementarily axiomatizable sets. This problem is 
solved using Gaifman’s Theorem and the observation that connected components 
in an N-free pomset are elementarily definable. Thus, we obtain the equivalence 
of starfreeness, aperiodicity, and elementary axiomatizability for sets of N-free
pomsets of bounded width. 

2 Basic Definitions 

2.1 Graphs and D-Sums 

Throughout this paper, let A be a finite set. We will consider finite directed
graphs whose vertices are labeled by the elements of A, i.e. structures $(V, E, \lambda)$
where $V$ is a finite nonempty set of vertices, $E \subseteq V \times V$ is the set of edges,
and $\lambda : V \to A$ is a labeling function. For short, we call these structures graphs.
The set of all graphs is denoted by G. The element a of A is identified with the
one-vertex graph labeled by a that does not contain any edge.

Example 2.1. We can, as usual, identify a word $w \in A^+$ with the labeled linear
order $(\{1, 2, \ldots, |w|\}, <, \lambda)$ where $\lambda(i)$ is the i-th letter of w. This is a particular
graph. Then the concatenation of words $v \cdot w$ corresponds to the disjoint union of
the sets of positions. The edge relation of the product is the edge relation of the
arguments together with a new edge from any position in v to any position in
w. This operation, which can be applied to any two graphs s and t in G, is called
sequential product and we write $s \cdot t$ for the sequential product of s and t. If s and
t are orders, this operation is known as lexicographic sum in the mathematical
literature. A graph $(V, E, \lambda)$ corresponds to some word iff it can be constructed
from the letters $a \in A$ by the sequential product.

Example 2.2. The possibly simplest operation on the set of graphs G is the dis- 
joint union, also known as parallel product s || t. A graph that can be constructed 
from the singletons by the sequential and the parallel product is called series- 
parallel graph. One can easily show that the edge relation of any series-parallel 
graph is actually a (strict) partial order. Furthermore the partial order $(N, <_N)$
cannot be embedded into a series-parallel graph. Since these properties charac- 
terize the series-parallel graphs (cf. [8]), we will speak of N-free pomsets. The set 
of all N-free pomsets is denoted by NF. 
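
The characterization in Example 2.2 can be explored on small cases. The following is a minimal Python sketch, not part of the original paper: it builds unlabeled strict partial orders with the sequential and the parallel product and checks by brute force whether the poset N embeds. It assumes the usual definition of N as the four-element poset with a < c, b < c, b < d and no further relations. All function names (`point`, `par`, `seq`, `contains_N`) are hypothetical, and labels are omitted because they play no role in N-freeness.

```python
from itertools import permutations


def point():
    """One-vertex poset: (vertices, strict order as a set of pairs)."""
    return {0}, set()


def par(s, t):
    """Parallel product: disjoint union, no new order relations."""
    (vs, ls), (vt, lt) = s, t
    vertices = {(0, x) for x in vs} | {(1, y) for y in vt}
    order = {((0, x), (0, y)) for x, y in ls} | {((1, x), (1, y)) for x, y in lt}
    return vertices, order


def seq(s, t):
    """Sequential product: every vertex of s is put below every vertex of t."""
    vertices, order = par(s, t)
    order |= {((0, x), (1, y)) for x in s[0] for y in t[0]}
    return vertices, order


def contains_N(vertices, order):
    """Brute-force test whether N (a<c, b<c, b<d, nothing else) embeds."""
    def incomparable(x, y):
        return (x, y) not in order and (y, x) not in order

    for a, b, c, d in permutations(vertices, 4):
        if ((a, c) in order and (b, c) in order and (b, d) in order
                and incomparable(a, b) and incomparable(c, d)
                and incomparable(a, d)):
            return True
    return False


# (a || b) . (c || d) is series-parallel, the poset N itself is not.
sp_example = seq(par(point(), point()), par(point(), point()))
n_poset = ({"a", "b", "c", "d"}, {("a", "c"), ("b", "c"), ("b", "d")})
```

The check is quartic in the number of vertices, which is adequate only for illustration on small posets.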




[Figure: the poset $(N, <_N)$]

Note that the parallel product is not only associative,
but also commutative. A structure $(S, \cdot, \parallel)$ where S is
a set, $\cdot$ and $\parallel$ are associative operations on S, and $\parallel$
is in addition commutative, is called sp-algebra [13].
Thus, $(\mathrm{NF}, \cdot, \parallel)$ is an sp-algebra.



Example 2.3. Let $SD \subseteq A \times A$ be reflexive. Then one can define the following
operation on graphs: For $s = (V_s, E_s, \lambda_s)$ and $t = (V_t, E_t, \lambda_t)$, the semitrace
product of s and t is the graph $(V, E, \lambda)$ where V is the disjoint union of $V_s$ and
$V_t$, the labeling is preserved, and $(x, y) \in E$ iff $(x, y) \in E_s \cup E_t$ or $x \in V_s$, $y \in V_t$,
and $(\lambda_s(x), \lambda_t(y)) \in SD$. A semitrace over SD is a graph that can be constructed
from the singletons using the semitrace product. The set of semitraces together
with the semitrace product is a semigroup generated by A.



Example 2.4. Now let $D \subseteq A \times A$ be reflexive and symmetric. Then the
semitrace product from the preceding example becomes the usual trace product as
considered in the theory of Mazurkiewicz traces. A semitrace over D is then
called Mazurkiewicz trace or simply trace. The set of traces together with the 
trace product is the well studied trace semigroup M(A, D). 

All the operations in the examples above can be seen as special instances of 
a D-sum that we introduce next: 
Definition 2.5. A sum description of arity n is a family $D = (D_{\ell,m})_{\ell,m \le n,\ \ell \ne m}$
of subsets of $A \times A$. For a sum description D of arity n and graphs
$s_i = (V_i, E_i, \lambda_i)$ for $1 \le i \le n$, the D-sum $\sum_D(s_1, \ldots, s_n) = (V, E, \lambda)$ is defined as follows:

$V = \{(\ell, z) \mid \ell \le n,\ z \in V_\ell\}$,

$((\ell, v), (m, w)) \in E$ iff $\ell \ne m$ and $(\lambda_\ell(v), \lambda_m(w)) \in D_{\ell,m}$,
or $\ell = m$ and $(v, w) \in E_\ell$, and

$\lambda(\ell, v) = \lambda_\ell(v)$.

Thus, the D-sum is the disjoint union of the summands. The edge relation
within the summands is not altered. Whether two nodes from different sum- 
mands are connected by an edge depends on their labels and is dictated by the 
sum description D. This operation is a special case of the “generalized sum” as 
considered in [16, Def. 2.3]. 

Example 2.6. All operations we presented above are binary. Let $D_{2,1} = \emptyset$. With
$D^{\cdot}_{1,2} = A \times A$, the D-sum equals the sequential product, with $D^{\parallel}_{1,2} = \emptyset$ the
parallel product, and with $D^{SD}_{1,2} = SD$ the semitrace product. Hence also the
trace product is a special instance of our concept of a D-sum.
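
To make the D-sum concrete, here is a minimal Python sketch under stated assumptions; it is an illustration, not code from the paper. A graph is represented as a triple `(V, E, lab)` of a vertex set, a set of edge pairs, and a labeling dictionary. A sum description is a dictionary mapping index pairs `(l, m)` with `l != m` (1-based) to sets of label pairs. The names `d_sum`, `singleton`, `SEQ` and `PAR` are hypothetical.

```python
from itertools import product


def d_sum(D, graphs):
    """D-sum of labeled graphs.

    D      : dict mapping (l, m), l != m, 1-based, to a set of label pairs
    graphs : list of triples (V, E, lab)
    """
    # Vertices of the sum are pairs (index of summand, original vertex).
    V = {(l, v) for l, g in enumerate(graphs, 1) for v in g[0]}
    lab = {(l, v): g[2][v] for l, g in enumerate(graphs, 1) for v in g[0]}
    E = set()
    for (l, v), (m, w) in product(V, V):
        if l == m:
            # Edges inside a summand are kept unchanged.
            if (v, w) in graphs[l - 1][1]:
                E.add(((l, v), (m, w)))
        elif (lab[(l, v)], lab[(m, w)]) in D.get((l, m), set()):
            # Edges between summands are dictated by the labels.
            E.add(((l, v), (m, w)))
    return V, E, lab


def singleton(a):
    """One-vertex graph labeled by a, without edges."""
    return {0}, set(), {0: a}


A = {"a", "b"}
SEQ = {(1, 2): set(product(A, A)), (2, 1): set()}  # sequential product
PAR = {(1, 2): set(), (2, 1): set()}               # parallel product

ab = d_sum(SEQ, [singleton("a"), singleton("b")])
a_par_b = d_sum(PAR, [singleton("a"), singleton("b")])
aba = d_sum(SEQ, [ab, singleton("a")])
```

Here `aba` is the graph of the word aba: a three-element strict linear order. The nested vertex names (for instance `(1, (2, 0))`) simply record how the graph was built.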

The construction of terms (seen as trees) seems at first sight not to fit into 
our setting as the term $f(t_1, t_2, \ldots, t_n)$ contains not only the disjoint union of
the terms $t_i$, but an additional root node that corresponds to the outermost
function symbol f. Nevertheless, we can model this construction by a nested
application of the disjoint union (of the arguments) and the sequential product
with the singleton graph f.

Let $\mathcal{D}$ be a set of sum descriptions. By $G(\mathcal{D})$, we denote the set of all graphs
that can be constructed from the singletons by the D-sums for some $D \in \mathcal{D}$.

Note that $G(\mathcal{D})$ is closed under the application of the D-sum for $D \in \mathcal{D}$.
Hence $(G(\mathcal{D}), (\sum_D)_{D \in \mathcal{D}})$ is an algebra over the signature $\mathcal{D}$. A $\mathcal{D}$-algebra is a
structure $(S, (f_D)_{D \in \mathcal{D}})$ where $f_D$ is an operation on S of the arity of D for any
$D \in \mathcal{D}$. Hence, $G(\mathcal{D})$ together with the D-sums for $D \in \mathcal{D}$ is a $\mathcal{D}$-algebra.

Note that the D-sum is associative for any binary sum description D. Furthermore,
it is commutative iff $D_{1,2} = D_{2,1}$. Thus, the $\{D\}$-algebra $G(\{D\})$ is a
(commutative) semigroup (if $D_{1,2} = D_{2,1}$).

Example 2.6 (continued). The set $G(\{D^{\cdot}\})$ is the set of labeled linear orders, i.e.
of those graphs that correspond to words over A. Therefore $(G(\{D^{\cdot}\}), \sum_{D^{\cdot}})$ is
the free semigroup generated by A. Similarly, $(G(\{D^{\cdot}, D^{\parallel}\}), (\sum_D)_{D \in \{D^{\cdot}, D^{\parallel}\}})$ is
isomorphic to the sp-algebra of N-free pomsets $(\mathrm{NF}, \cdot, \parallel)$. Finally, $(G(\{D^{SD}\}), \sum_{D^{SD}})$
is the semigroup of semitraces and therefore isomorphic to the trace semigroup
if SD is symmetric.

A set $X \subseteq G(\mathcal{D})$ is recognizable if there exists a finite $\mathcal{D}$-algebra $(S, (f_D)_{D \in \mathcal{D}})$
and a homomorphism $\eta : G(\mathcal{D}) \to S$ such that $X = \eta^{-1}\eta(X)$. Note that we
could have required the homomorphism $\eta$ to be surjective without changing the
concept of recognizability.
2.2 Logic

Monadic formulas involve first order variables x, y, z, ... for vertices and monadic
second order variables X, Y, Z, ... for sets of vertices. They are built up from
the atomic formulas $\lambda(x) = a$ for $a \in A$, $(x, y) \in E$, and $x \in X$ by means of the
Boolean connectives $\neg$, $\lor$, $\land$, and quantifiers $\exists$, $\forall$ (both for first order and
for second order variables). Formulas without free variables are called sentences.
The satisfaction relation $\models$ between graphs $g = (V, E, \lambda)$ and monadic sentences
$\varphi$ is defined canonically with the understanding that first order variables range
over the vertices of V and second order variables over subsets of V.

Let $\mathcal{C}$ be a set of graphs and $\varphi$ a monadic sentence. Furthermore, let
$X = \{g \in \mathcal{C} \mid g \models \varphi\}$ denote the set of graphs from $\mathcal{C}$ that satisfy $\varphi$. Then we say
that the sentence $\varphi$ axiomatizes the set X relative to $\mathcal{C}$ or that X is monadically
axiomatizable relative to $\mathcal{C}$.

An elementary formula is a monadic formula that does not contain any set
variable. A set $X \subseteq \mathcal{C}$ is elementarily axiomatizable relative to $\mathcal{C}$ if there exists
an elementary sentence that axiomatizes X relative to $\mathcal{C}$.

The quantifier depth of a monadic formula is defined canonically. For a graph
s and a positive integer k, let $\mathrm{MTh}_k(s)$ denote the set of all monadic sentences
$\varphi$ of quantifier depth at most k that are satisfied by s. The set $\mathrm{MTh}_k(s)$ is the
k-bounded monadic theory of s. Analogously, the k-bounded elementary theory
$\mathrm{Th}_k(s)$ comprises all elementary sentences in $\mathrm{MTh}_k(s)$. By $\mathrm{MTH}_k$ and $\mathrm{TH}_k$, we
denote the sets of all k-bounded monadic and elementary theories of graphs. Up to
logical equivalence, there are only finitely many monadic sentences of quantifier
depth at most k; hence the set of k-bounded monadic theories is finite (and the
same holds for the elementary theories).

3 Axiomatizable Sets of Graphs 

3.1 Monadically Axiomatizable Sets Are Recognizable 

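Theorem 3.1. Let $k \in \mathbb{N}$ and D be a sum description of arity n. Furthermore,
let $s_\ell$ and $t_\ell$ be graphs for $\ell \le n$.

(1) If $\mathrm{MTh}_k(s_\ell) = \mathrm{MTh}_k(t_\ell)$ for $\ell \le n$, then $\mathrm{MTh}_k(\sum_D s_\ell) = \mathrm{MTh}_k(\sum_D t_\ell)$.

(2) Similarly, if $\mathrm{Th}_k(s_\ell) = \mathrm{Th}_k(t_\ell)$ for $\ell \le n$, then $\mathrm{Th}_k(\sum_D s_\ell) = \mathrm{Th}_k(\sum_D t_\ell)$.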

We already mentioned that the D-sum of n graphs is a special case of the
generalized sum as considered by Shelah. The preceding theorem follows from
his result [16, Thm. 2.4]. A condensed proof of the full result can be found in
[9]; for lexicographic sums of linear orders, [19] contains the full proof. In the
terminology of [2], the first statement says that D-sums are Hintikka operations,
and it can be derived from their results on quantifier free definable operations.
Shelah is also interested in the effective computability of the combined theory
from the argument theories. If one is only interested in the result as stated above,
an alternative and much simpler proof of both statements can be given using
Ehrenfeucht-Fraïssé games for the respective logics.

Again, let D be a sum description of arity n and $k \in \mathbb{N}$. Then by the above
theorem the set $\mathrm{MTH}_k$ of k-bounded monadic theories can be equipped with an
n-ary operation $f_D$ such that

$\mathrm{MTh}_k(\sum_D(s_1, s_2, \ldots, s_n)) = f_D(\mathrm{MTh}_k(s_1), \mathrm{MTh}_k(s_2), \ldots, \mathrm{MTh}_k(s_n))$.

Hence the mapping $\mathrm{MTh}_k$ is a homomorphism from the $\mathcal{D}$-algebra of graphs
onto the $\mathcal{D}$-algebra $(\mathrm{MTH}_k, (f_D)_{D \in \mathcal{D}})$ for any set of sum descriptions $\mathcal{D}$:

Corollary 3.2. Let $\mathcal{D}$ be a set of sum descriptions. The set of k-bounded
monadic theories $\mathrm{MTH}_k$ can be equipped with the structure of a $\mathcal{D}$-algebra such
that $\mathrm{MTh}_k : (G, (\sum_D)_{D \in \mathcal{D}}) \to \mathrm{MTH}_k$ is a homomorphism between $\mathcal{D}$-algebras.

Now let $\varphi$ be a sentence of the monadic second order logic of quantifier
depth k. Since $t \models \varphi$ iff $\varphi \in \mathrm{MTh}_k(t)$, this homomorphism recognizes the set
$\{t \in G(\mathcal{D}) \mid t \models \varphi\}$, i.e. we showed

Theorem 3.3. Let $\varphi$ be a sentence of the monadic second order logic. Then
$\{t \in G(\mathcal{D}) \mid t \models \varphi\}$ is recognizable in $(G(\mathcal{D}), (\sum_D)_{D \in \mathcal{D}})$.

For $\mathcal{D}$ finite, this result follows from [2], but their technique of interpreting
a graph in a generating term cannot be applied for $\mathcal{D}$ infinite.

Corollary 3.4. 1. Any monadically axiomatizable set of words, traces, or semitraces
can be recognized by a homomorphism into a finite semigroup.

2. Any monadically axiomatizable set of N-free pomsets can be recognized by a
homomorphism from $(\mathrm{NF}, \cdot, \parallel)$ into a finite sp-algebra.

This corollary generalizes Büchi’s Theorem for finite words as well as a result
by Thomas [18] on traces. The result on semitraces is new. For N-free pomsets, a
weaker statement was shown in [11]. There, we had to require that the pomsets
$t \in \mathrm{NF}$ that satisfy $\varphi$ are width bounded.

3.2 Elementarily Axiomatizable Sets Are Aperiodic 

Let $\varphi$ be an elementary sentence. There exists a finite $\mathcal{D}$-algebra and a
homomorphism into this algebra recognizing the set of graphs that satisfy $\varphi$. In this
section, we want to derive additional properties of this finite $\mathcal{D}$-algebra.

Let $B \in \mathcal{D}$ be a binary sum description. To simplify notation, we will
write $s +_B t$ for $\sum_B(s, t)$. A $\mathcal{D}$-algebra $(S, (f_D)_{D \in \mathcal{D}})$ is aperiodic iff $(S, f_B)$ is
an aperiodic semigroup for any binary $B \in \mathcal{D}$. A set $X \subseteq G(\mathcal{D})$ is aperiodic if it
can be recognized by a homomorphism into a finite aperiodic $\mathcal{D}$-algebra. In the
following example, we consider some aperiodic sets and their closure properties:

Example 3.5. Obviously, a homomorphism that recognizes $X \subseteq G(\mathcal{D})$ recognizes
$G(\mathcal{D}) \setminus X$ as well. The set of aperiodic $\mathcal{D}$-algebras is closed under finite direct
products. Hence the union as well as the intersection of aperiodic sets in $G(\mathcal{D})$
are aperiodic, i.e. the set of aperiodic sets is closed under Boolean operations.

Next consider the following sp-algebra: $S = \{1, 2\}$, $s \cdot s' = 1$ and $s \parallel s' = 2$ for
any $s, s' \in S$. Since, for $s \in S$, we have $s \cdot s = 1 = s \cdot s \cdot s$ and $s \parallel s = 2 = s \parallel s \parallel s$,
the sp-algebra $(S, \cdot, \parallel)$ is aperiodic. Then the mapping $\eta : \mathrm{NF} \to \{1, 2\}$ with
$\eta(s) = 1$ iff s is connected is a homomorphism. Hence the set of connected
N-free pomsets as well as its complement $\mathrm{NF} \parallel \mathrm{NF}$ are aperiodic.
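
The two-element sp-algebra of Example 3.5 is small enough to check mechanically. The sketch below is an illustration under stated assumptions, not code from the paper. It uses the standard characterization that a finite semigroup is aperiodic iff for every element s the sequence of powers s, s^2, s^3, ... becomes constant, and it evaluates the homomorphism on series-parallel terms written as nested tuples. The names `is_aperiodic`, `seq_op`, `par_op` and `eta`, as well as the term encoding, are hypothetical.

```python
def is_aperiodic(elements, op):
    """Check that for every s the powers s, s^2, ... eventually become constant."""
    for s in elements:
        power, seen = s, []
        while power not in seen:
            seen.append(power)
            power = op(power, s)
        # `power` is now the first repeated power, i.e. it lies on the cycle
        # of the power sequence; aperiodicity means that cycle has length 1.
        if op(power, s) != power:
            return False
    return True


SP = {1, 2}


def seq_op(s, t):
    """Sequential product in the sp-algebra: the result is always 1."""
    return 1


def par_op(s, t):
    """Parallel product in the sp-algebra: the result is always 2."""
    return 2


def eta(term):
    """Evaluate a series-parallel term: a letter, or (kind, left, right)
    with kind in {"seq", "par"}. Single letters are connected, hence 1."""
    if isinstance(term, str):
        return 1
    kind, left, right = term
    if kind == "seq":
        return seq_op(eta(left), eta(right))
    return par_op(eta(left), eta(right))
```

For contrast, addition modulo 2 on {0, 1} is a group with a nontrivial cycle, so `is_aperiodic({0, 1}, lambda s, t: (s + t) % 2)` returns `False`.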
Adopting the above methods for monadic second order logic, we can define
the $\mathcal{D}$-algebra of k-bounded elementary theories $(\mathrm{TH}_k, (f_D)_{D \in \mathcal{D}})$ for any set of
sum descriptions $\mathcal{D}$. Then the mapping $\mathrm{Th}_k$ from the set of graphs onto this
$\mathcal{D}$-algebra is a homomorphism. Similarly to the monadic case, it is immediate that
any elementarily axiomatizable set can be recognized by the homomorphism $\mathrm{Th}_k$
for some k. Thus, the core of this section is to prove that $(\mathrm{TH}_k, f_B)$ is aperiodic
for any binary sum description B.

Theorem 3.1 relates two D-sums for some fixed D. Next, we will prove a
similar-looking result for different D-sums. To compensate for this more general
situation, we require that all argument graphs $s_\ell$ and $t_m$ have the same
k-bounded elementary theory T. In addition, we have to relate the different sum
descriptions that will appear in the lemma. Loosely speaking, they have to have
the same k-bounded elementary theory. To give this intuition a clear meaning,
we consider an n-ary sum description D as a relational structure $R(D)$ on the
carrier set $\{1, 2, \ldots, n\}$. For any $M \subseteq A \times A$, the relational structure $R(D)$ has
a binary relation $\{(i, j) \mid D_{i,j} = M\}$.

Theorem 3.6. Let $k \in \mathbb{N}$, and let $D^1$ and $D^2$ be sum descriptions of arity $n_1$
and $n_2$, respectively. Assume that the associated relational structures $R(D^1)$ and
$R(D^2)$ have the same k-bounded elementary theory. Furthermore, let $s_x$ and $t_y$
be graphs with $\mathrm{Th}_k(s_x) = \mathrm{Th}_k(t_y)$ for $x \le n_1$ and $y \le n_2$. Then
$\mathrm{Th}_k(\sum_{D^1} s_x) = \mathrm{Th}_k(\sum_{D^2} t_y)$.

Proof. The proof uses Ehrenfeucht-Fraïssé games, see [6, p. 16 ff.] for an
introduction. Following the notation in [6], we write $G_k(s, t)$ to denote the game on
the structures s and t with k rounds.

By the requirements on $D^1$, $D^2$, $s_x$ and $t_y$, Duplicator has a winning strategy
for the games $G_k(R(D^1), R(D^2))$ and $G_k(s_x, t_y)$ for $x \le n_1$ and $y \le n_2$. A
winning strategy for $G_k(\sum_{D^1} s_x, \sum_{D^2} t_y)$ is described as follows: Suppose in the
first $\ell < k$ rounds, Spoiler and Duplicator played $((x_i, a_i), (y_i, b_i))_{1 \le i \le \ell}$ with
$x_i \le n_1$, $y_i \le n_2$, $a_i \in V_{s_{x_i}}$ and $b_i \in V_{t_{y_i}}$. Now, in round $\ell + 1$, Spoiler chooses
$(x, a)$ with $x \le n_1$ and $a \in V_{s_x}$. Note that $(x_i, y_i)_{1 \le i \le \ell}$ can be seen as the result
of $\ell$ rounds in the game $G_k(R(D^1), R(D^2))$. The winning strategy for this game
tells Duplicator which $y \le n_2$ to choose. Now let $1 \le i_1 < i_2 < \cdots < i_n \le \ell$
be those indices for which $y_{i_j} = y$. Since Duplicator always plays according to
the strategy we are describing, we obtain $x_{i_j} = x$ for all these indices. Hence
$(a_{i_j}, b_{i_j})_{1 \le j \le n}$ is the result of n rounds of the game $G_k(s_x, t_y)$ and Duplicator’s
winning strategy for this game tells him how to answer Spoiler’s move a. □

A slightly weaker form of the above theorem follows from [16]. Namely, the
structures $R(D^1)$ and $R(D^2)$ would have to coincide in their $\ell$-bounded theory
where $\ell$ can be computed effectively from k. In our case of D-sums, it suffices to
take $\ell = k$.

Corollary 3.7. Let $k \in \mathbb{N}$ and let $B = (B_{1,2}, B_{2,1})$ be a binary sum description.
Then the semigroup of all k-bounded elementary theories $(\mathrm{TH}_k, f_B)$ is aperiodic.
Proof. Let s be a graph.

For $n \in \mathbb{N}$, let $D^n$ be the following n-ary sum description defined by $D^n_{i,j} = B_{1,2}$
and $D^n_{j,i} = B_{2,1}$ for $1 \le i < j \le n$. Since $+_B$ is associative, we obtain
$\sum_{D^{n+1}}(s_1, s_2, \ldots, s_n, s) = \sum_{D^n}(s_1, \ldots, s_n) +_B s$.

A special case of [12, Lemma 4.4] states that

$\mathrm{Th}_k(\{1, 2, \ldots, 2^k - 1\}, <) = \mathrm{Th}_k(\{1, 2, \ldots, 2^k\}, <)$.

Note that the associated relational structure $R(D^n)$ has the carrier set $\{1, \ldots, n\}$.
Furthermore, a pair (i, j) with $i \ne j$ is labeled by $B_{1,2}$ iff i < j, and by $B_{2,1}$
otherwise. Hence the k-bounded elementary theory of $R(D^{2^k - 1})$ equals that of
$R(D^{2^k})$. Now the theorem above implies

$\mathrm{Th}_k(\underbrace{s +_B s +_B \cdots +_B s}_{(2^k - 1)\ \text{times}}) = \mathrm{Th}_k(\sum_{D^{2^k - 1}}(s, s, \ldots, s))$
$= \mathrm{Th}_k(\sum_{D^{2^k}}(s, s, \ldots, s)) = \mathrm{Th}_k(\underbrace{s +_B s +_B \cdots +_B s}_{2^k\ \text{times}})$.

Since $\mathrm{Th}_k$ is surjective, the result follows. □
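
The equality of the k-bounded elementary theories of the linear orders with $2^k - 1$ and $2^k$ elements, which the proof above relies on, can be checked for small k by brute force. The following Python sketch is an illustration only, under stated assumptions: it searches the k-round Ehrenfeucht-Fraïssé game on two finite linear orders over the signature {<} and reports whether Duplicator has a winning strategy. The function name `duplicator_wins` is hypothetical, and the exhaustive search is feasible only for very small k.

```python
from functools import lru_cache


def duplicator_wins(m, n, k):
    """True iff Duplicator wins the k-round EF game on the linear orders
    ({1..m}, <) and ({1..n}, <)."""

    def consistent(xs, ys, x, y):
        # The extended map must stay a partial isomorphism w.r.t. < and =.
        return all((a < x) == (b < y) and (a == x) == (b == y)
                   for a, b in zip(xs, ys))

    @lru_cache(maxsize=None)
    def win(xs, ys, rounds):
        if rounds == 0:
            return True
        # Spoiler moves in the first order; Duplicator needs a good reply.
        for x in range(1, m + 1):
            if not any(consistent(xs, ys, x, y)
                       and win(xs + (x,), ys + (y,), rounds - 1)
                       for y in range(1, n + 1)):
                return False
        # Spoiler moves in the second order.
        for y in range(1, n + 1):
            if not any(consistent(xs, ys, x, y)
                       and win(xs + (x,), ys + (y,), rounds - 1)
                       for x in range(1, m + 1)):
                return False
        return True

    return win((), (), k)
```

For k = 2 the orders with 3 and 4 elements are indistinguishable, while the orders with 2 and 3 elements are not: Spoiler picks the middle element of the three-element order and then moves to the side on which Duplicator's answer has no room left.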

Now let $X \subseteq G(\mathcal{D})$ be an elementarily axiomatizable set. Since Corollary 3.2
holds for elementary theories verbatim, X is recognized by the homomorphism
$\mathrm{Th}_k$ into the finite $\mathcal{D}$-algebra of k-bounded elementary theories for some k. By
Corollary 3.7, this $\mathcal{D}$-algebra is aperiodic. Hence we obtain

Theorem 3.8. Let $\varphi$ be an elementary sentence and $\mathcal{D}$ be a set of sum
descriptions. Then $\{t \in G(\mathcal{D}) \mid t \models \varphi\}$ is an aperiodic set in $(G(\mathcal{D}), (\sum_D)_{D \in \mathcal{D}})$.

As a corollary, we obtain the known results on finite words [14] as well as on 
finite traces [18,7]. The corresponding statements for semitraces and for N-free 
pomsets are new. 

Corollary 3.9. 1. Any elementarily axiomatizable set of words, traces, or
semitraces can be recognized by a homomorphism into a finite aperiodic semigroup.

2. Any elementarily axiomatizable set of N-free pomsets can be recognized by a
homomorphism from $(\mathrm{NF}, \cdot, \parallel)$ into a finite aperiodic sp-algebra.

4 Aperiodic Sets of N-Free Pomsets Are Elementarily 
Axiomatizable 

By [14,18,7], the converses of Corollaries 3.4(1) and 3.9(1) hold for words and for
traces. Let $X \subseteq \mathrm{NF}$ be a recognizable set of N-free pomsets of bounded width.
Then, according to [11], X is monadically axiomatizable, i.e. in this restricted
setting the converse of Corollary 3.4(2) holds as well. It is the aim of this section
to show similarly the converse of Corollary 3.9(2), i.e. to show that any aperiodic
set of N-free pomsets of bounded width is elementarily axiomatizable. This is shown
via starfree sets of N-free pomsets that we consider next.
4.1 Starfree Sets Are Elementarily Axiomatizable

The sequential and the parallel product of N-free pomsets can easily be extended
to sets of N-free pomsets. By $X^+$, we denote the sequential iteration of X, i.e.
the set $\{p_1 \cdot p_2 \cdots p_n \mid n > 0,\ p_i \in X\}$. Recall that we did not allow a poset to be
empty. Therefore, in general, $X \cdot \mathrm{NF}$ does not contain X. Since occasionally it
will be convenient to have this, we will use the abbreviations $X \cdot \mathrm{NF}_\varepsilon = X \cdot \mathrm{NF} \cup X$
and similarly $\mathrm{NF}_\varepsilon \cdot X = \mathrm{NF} \cdot X \cup X$ for $X \subseteq \mathrm{NF}$.

The class of starfree languages is the least class $\mathcal{C}$ of subsets of NF with
$\{s\} \in \mathcal{C}$ for $s \in \mathrm{NF}$ that is closed under the Boolean operations and the sequential
and parallel product.

Example 4.1. The width $w(t)$ of a poset t is the maximal size of an antichain in
t. For $k \in \mathbb{N}$, let $\mathrm{NF}_k = \{t \in \mathrm{NF} \mid w(t) \le k\}$. The set $\mathrm{NF}_\varepsilon \cdot (\mathrm{NF} \parallel \mathrm{NF}) \cdot \mathrm{NF}_\varepsilon$
of all N-free pomsets of width at least 2 is starfree. Hence, the set $\mathrm{NF}_1$ of linear
pomsets is starfree, too. By Example 2.1, words over A can be identified with
linear pomsets, i.e. with the elements of $\mathrm{NF}_1$. Then, of course, $A^+ \setminus L$ corresponds
to $\mathrm{NF}_1 \setminus L'$ where $L' \subseteq \mathrm{NF}_1$ corresponds to L. Hence, starfree word languages
correspond to starfree subsets of NF.

We saw above that $\mathrm{NF}_1$ is starfree. Note that, for any $k \in \mathbb{N}$, the set
$\mathrm{NF}_\varepsilon \cdot ((\mathrm{NF} \setminus \mathrm{NF}_{k-1}) \parallel \mathrm{NF}) \cdot \mathrm{NF}_\varepsilon$ is the set of N-free pomsets of width at least $k + 1$.
Hence, its complement $\mathrm{NF}_k$ is starfree.
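
The width used in Example 4.1 is easy to compute on small posets. The following Python sketch is a brute-force illustration only, not part of the paper: it tries antichain candidates from the largest size downwards. The function name `width` and the representation of a strict partial order as a set of pairs are assumptions.

```python
from itertools import combinations


def width(vertices, order):
    """Maximal size of an antichain in the strict partial order
    (vertices, order), where (x, y) in order means x < y."""
    vertices = list(vertices)
    for size in range(len(vertices), 0, -1):
        for candidate in combinations(vertices, size):
            # An antichain consists of pairwise incomparable elements.
            if all((x, y) not in order and (y, x) not in order
                   for x, y in combinations(candidate, 2)):
                return size
    return 0


chain = ({1, 2, 3}, {(1, 2), (1, 3), (2, 3)})        # a linear pomset
antichain = ({1, 2, 3}, set())                        # three parallel points
vee = ({"a", "b", "c"}, {("a", "c"), ("b", "c")})     # (a || b) . c
```

The search is exponential in the number of vertices; it is meant to illustrate the definition rather than to be used on large inputs.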



Theorem 4.2. Let $X \subseteq \mathrm{NF}$ be starfree. Then X is elementarily axiomatizable.

Proof. As usual, this can be shown by induction on the construction of starfree
sets. The base case as well as the Boolean operations and the sequential product
are easily dealt with. To handle the parallel product, we have to invest a bit more
work: Let $X_i \subseteq \mathrm{NF}$ be axiomatized by $\varphi_i$ for i = 1, 2. The distance between two
elements of an N-free pomset is 0, 1, or 2 (if they lie in the same connected
component), and $\infty$ if the two elements belong to different connected components.
Hence we can infer from Gaifman’s Theorem [6] that any elementary sentence is
equivalent to a Boolean combination of statements of the form

“there are $\ge n$ connected components $C \subseteq V$ with $\mathrm{Th}_k(C, <, \lambda) = T$” (*)

for some bounded theory $T \in \mathrm{TH}_k$.

Let $t = (V, <, \lambda) \in \mathrm{NF}$ and $x, y \in V$. Then x and y lie in the same
connected component of t iff they are bounded from above or from below. Hence
the statements of the form (*) are expressible by an elementary sentence $\psi(n, T)$.

Thus, for any sentence $\varphi$, there exists a Boolean combination $\varphi'$ of sentences
$\psi(n, T)$ such that $t \models \varphi$ iff $t \models \varphi'$ for any N-free pomset $t \in \mathrm{NF}$. The sentence
$\bigwedge \psi(n, T) \land \bigwedge \neg\psi(n, T)$ (**) is a particularly simple such Boolean combination
where the two conjunctions run over finite subsets of $\mathbb{N} \times \mathrm{TH}_k$. Let $\delta_1$ and $\delta_2$ be
two conjunctions of the form (**). Then we define a new such formula $\delta$ as the
conjunction of the following formulas (where $T \in \mathrm{TH}_k$):

$\psi(n_1 + n_2, T)$ if $\psi(n_1, T)$ occurs positively in $\delta_1$ and
$\psi(n_2, T)$ occurs positively in $\delta_2$,

$\psi(n_i, T)$ if $\psi(n_i, T)$ occurs positively in $\delta_i$ (for i = 1, 2), and

$\neg\psi(n_1 + n_2 - 1, T)$ if $\psi(n_1, T)$ occurs negatively in $\delta_1$ and
$\psi(n_2, T)$ occurs negatively in $\delta_2$.

Now one can check that an N-free pomset t satisfies $\delta$ iff it is the parallel
product of N-free pomsets $t_1$ and $t_2$ that satisfy $\delta_1$ and $\delta_2$, resp.

Coming back to $X_1 \parallel X_2$, we can w.l.o.g. assume that $\varphi_1$ and $\varphi_2$ are Boolean
combinations of sentences $\psi(n, T)$. Even more, one can write $\varphi_i$ as a disjunction
of conjunctions of the above form (**). Combining any pair of disjuncts from $\varphi_1$
and $\varphi_2$ in the above described manner, we obtain a sentence $\varphi$ that axiomatizes
$X_1 \parallel X_2$. □

4.2 Aperiodic Sets of N-Free Pomsets Are Starfree

Let S be a set and let $X_s \subseteq \mathrm{NF}$ for $s \in S$. Then the set of sets starfree over
$\{X_s \mid s \in S\}$ is the least class $\mathcal{C} \subseteq \mathcal{P}(\mathrm{NF})$ containing $X_s$ for $s \in S$ and $\{t\}$ for
$t \in \mathrm{NF}$ that is closed under the Boolean operations and the sequential and
parallel product. Note that $X \subseteq \mathrm{NF}$ is starfree if it is starfree over the empty
family.

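Example 4.3. We show that the sequential iteration of a starfree subset of
$\mathrm{NF} \parallel \mathrm{NF}$ is starfree: More generally, let $L \subseteq (\mathrm{NF} \parallel \mathrm{NF}) \cup A$. Then $L^+$ is starfree
over $\{L\}$ which, in case L is starfree, implies the starfreeness of $L^+$: A minimal
sequential factor of a poset t is a poset $t' \in (\mathrm{NF} \parallel \mathrm{NF}) \cup A$ such that
$t \in \mathrm{NF}_\varepsilon \cdot t' \cdot \mathrm{NF}_\varepsilon$. Now one easily observes that $L^+$ is the set of all pomsets from NF
all of whose minimal sequential factors belong to L. Hence

$L^+ = \mathrm{NF} \setminus (\mathrm{NF}_\varepsilon \cdot ((\mathrm{NF} \setminus L) \cap ((\mathrm{NF} \parallel \mathrm{NF}) \cup A)) \cdot \mathrm{NF}_\varepsilon)$,

a set starfree over $\{L\}$.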

Let S be a set. Then $S^+$ denotes the set of nonempty words over S. We
denote the concatenation operation of words over S by $\odot$. Now let
$f : (\mathrm{NF} \parallel \mathrm{NF}) \cup A \to S$ be a function. By [8], the semigroup $(\mathrm{NF}, \cdot)$ is freely generated by
$(\mathrm{NF} \parallel \mathrm{NF}) \cup A$. Hence, we can uniquely extend the function f to a semigroup
homomorphism (also denoted f) from $(\mathrm{NF}, \cdot)$ to $(S^+, \odot)$. For a set of words
$K \subseteq S^+$, the set $f^{-1}(K) = \{t \in \mathrm{NF} \mid f(t) \in K\}$ is a set of N-free pomsets. The
following lemma is shown by induction on the construction of the starfree word
language K.

Lemma 4.4. Let $f : (\mathrm{NF} \parallel \mathrm{NF}) \cup A \to S$ be a function and let K be a starfree
word language. Then $K' = f^{-1}(K) \subseteq \mathrm{NF}$ is starfree over $\{f^{-1}(s) \mid s \in S\}$.



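Lemma 4.5. Let $X \subseteq \mathrm{NF}$ be recognized by a homomorphism
$\eta : (\mathrm{NF}, \cdot, \parallel) \to (S, \cdot, \parallel)$ into the finite aperiodic sp-algebra $(S, \cdot, \parallel)$. Then X is starfree over
$\{\eta^{-1}(s) \cap (\mathrm{NF} \parallel \mathrm{NF}) \mid s \in S\}$.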
Proof. Let $f : (\mathrm{NF}, \cdot) \to (S^+, \odot)$ be the uniquely determined semigroup
homomorphism with $f(t) = \eta(t)$ for $t \in (\mathrm{NF} \parallel \mathrm{NF}) \cup A$. From the free, finitely
generated semigroup $(S^+, \odot)$ we have the canonical semigroup homomorphism
$\alpha$ onto $(S, \cdot)$. Then $\eta = \alpha \circ f$.

By [15], $K := \alpha^{-1}(\eta(X)) \subseteq S^+$ is a starfree word language since $(S, \cdot)$
is a finite aperiodic semigroup. By Lemma 4.4, $f^{-1}(K) \subseteq \mathrm{NF}$ is starfree over
$\{f^{-1}(s) \mid s \in S\}$. Since $f^{-1}(s)$ is the union of $\eta^{-1}(s) \cap (\mathrm{NF} \parallel \mathrm{NF})$ and $\eta^{-1}(s) \cap A$,
the set $f^{-1}(K)$ is starfree over $\{\eta^{-1}(s) \cap (\mathrm{NF} \parallel \mathrm{NF}) \mid s \in S\}$. Since $f^{-1}(K) = X$,
the result follows. □



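Lemma 4.6. Let $X \subseteq \mathrm{NF} \parallel \mathrm{NF}$ be aperiodic. Then there exist $n \in \mathbb{N}$ and
aperiodic sets $K_i, L_i \subseteq \mathrm{NF}$ for $1 \le i \le n$ such that $X = \bigcup_{1 \le i \le n} K_i \parallel L_i$.

Proof. There exist a finite aperiodic sp-algebra $(S, \cdot, \parallel)$ and an sp-homomorphism
$\eta : \mathrm{NF} \to S$ such that $X = \eta^{-1}\eta(X)$. Let $G = \{(s_1, s_2) \in S \times S \mid s_1 \parallel s_2 \in \eta(X)\}$.
Note that $\eta^{-1}(s) \subseteq \mathrm{NF}$ is aperiodic for any $s \in S$. Now one can show

$X = \bigcup \{\eta^{-1}(s_1) \parallel \eta^{-1}(s_2) \mid (s_1, s_2) \in G\}$. □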



Lemma 4.7. Let $k \in \mathbb{N}$ and let $X \subseteq \mathrm{NF}_k$ be starfree over $\{Y_s \mid s \in S\} = \mathcal{K}$
where $Y_s \subseteq \mathrm{NF}$ for $s \in S$. Then X is starfree over
$\{Y_s \cap \mathrm{NF}_\ell \mid s \in S,\ 1 \le \ell \le k\} = \mathcal{K}_k$.

Proof. One actually shows that $X \cap \mathrm{NF}_k$ is starfree over $\mathcal{K}_k$ whenever X is
starfree over $\mathcal{K}$. This is done by induction on the starfree construction of the set
X (and not by induction on k as one might think). □



Theorem 4.8. Let $k \in \mathbb{N}$ and let $X \subseteq \mathrm{NF}_k$ be aperiodic. Then X is starfree.

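Proof. The theorem is shown by induction on k. By Example 4.1,
Schützenberger’s Theorem [15] implies the theorem for k = 1. Now suppose the theorem holds
for all $\ell < k$ and let $X \subseteq \mathrm{NF}_k$. Then $X = (X \cap (\mathrm{NF} \parallel \mathrm{NF})) \cup (X \setminus (\mathrm{NF} \parallel \mathrm{NF}))$.

By Lemma 4.6, $X \cap (\mathrm{NF} \parallel \mathrm{NF})$ is a finite union of sets $K \parallel L$ with
$K, L \subseteq \mathrm{NF}_{k-1}$ aperiodic. Thus, by the induction hypothesis, K and L and therefore
$X \cap (\mathrm{NF} \parallel \mathrm{NF})$ are starfree. In particular, we showed the theorem for aperiodic
sets contained in $(\mathrm{NF} \parallel \mathrm{NF}) \cap \mathrm{NF}_k$.

Next, we deal with $X \setminus (\mathrm{NF} \parallel \mathrm{NF})$: It is recognized by an sp-homomorphism
$\eta : \mathrm{NF} \to S$ into a finite aperiodic sp-algebra $(S, \cdot, \parallel)$. By Lemma 4.5,
$X \setminus (\mathrm{NF} \parallel \mathrm{NF})$ is starfree over $\{\eta^{-1}(s) \cap (\mathrm{NF} \parallel \mathrm{NF}) \mid s \in S\}$. By Lemma 4.7, it is therefore
starfree over $\{\eta^{-1}(s) \cap (\mathrm{NF} \parallel \mathrm{NF}) \cap \mathrm{NF}_\ell \mid s \in S,\ \ell \le k\}$. Note that any of the
sets $\eta^{-1}(s) \cap (\mathrm{NF} \parallel \mathrm{NF}) \cap \mathrm{NF}_\ell$ is aperiodic and contained in $(\mathrm{NF} \parallel \mathrm{NF}) \cap \mathrm{NF}_k$.
Thus, by what we showed above, they are starfree. □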
Abstract. We present a new proof technique for collapse results for
first-order queries on databases which are embedded in $\mathbb{N}$ or $\mathbb{R}_{\ge 0}$. Our
proofs are by means of an explicitly constructed winning strategy for
Duplicator in an Ehrenfeucht-Fraïssé game, and can deal with certain
infinite databases where previous, highly involved methods fail. Our main
result is that first-order logic has the natural-generic collapse over
$(\mathbb{N}, \le, +)$ for arbitrary (i.e., possibly infinite) databases. Furthermore, a first
application of this result shows the natural-generic collapse of first-order
logic over $(\mathbb{R}_{\ge 0}, \le, +)$ for a certain kind of databases over $\mathbb{R}_{\ge 0}$ which
consist of a possibly infinite number of regions.

Classification: Logic in Computer Science, Database Theory. 



1 Introduction 

One of the issues in database theory that have attracted much interest in recent 
years is the study of relational databases which are embedded in a fixed, possibly 
infinite structure. This occurs, e.g., in current applications, such as spatial or 
temporal databases, where data are represented by (natural or real) numbers 
(for a recent comprehensive survey see [6]). In many applications, the numerical 
values only serve as names which are exchangeable. If the underlying structure 
is linearly ordered, it is often only the relative order of the data which is of 
interest. For this situation, locally order generic queries have been studied, i.e., 
queries which commute with every order-preserving embedding of the active 
domain into the underlying structure. One central theme here is the question
of how much additional power a language such as first-order logic can gain for
locally order generic queries by using additional (e.g., arithmetical) predicates of
the underlying structure. It is well-known that with addition and multiplication
first-order logic can, indeed, express more locally order generic queries over $\mathbb{N}$
than with order alone; for other cases, however, these investigations have led
to so-called collapse results for first-order queries. As an example, take the
following theorem (cf. [3], Proposition 3.6.4).

Theorem. First-order logic has the natural-generic collapse over $(\mathbb{N}, \le, +)$ for
finite databases. □

In this theorem, databases are considered the active domain of which is a finite
subset of $\mathbb{N}$. A first-order query has natural semantics if quantifiers are
interpreted over all of $\mathbb{N}$. The theorem states that, under natural semantics, every
locally order generic first-order query which uses (apart from the database
relations) $\le$ and $+$ can be equivalently replaced by one which uses only $\le$ (and
the database relations). In other words, under the stated hypotheses, addition
does not add to the expressive power of first-order logic.

Theorems like this can be derived from general collapse results in [1,2,4]; for 
an overview see [3]. The proofs for these results are rather involved; they use 
non-standard models, and are non-constructive. 

Our contribution in this paper is twofold: We extend the natural-generic
collapse of first-order logic for the case of $(\mathbb{N}, \le, +)$ as underlying structure from
finite to arbitrary databases; and for the case of underlying structure $(\mathbb{R}_{\ge 0}, \le, +)$,
where it was known for finitely representable databases, we show that it also
holds for nicely representable databases (precise formulations are given in
Section 3). In both cases, we transcend the finite limitations of the previously used
proof techniques. Moreover, our proofs are constructive in the sense that they
are obtained using an Ehrenfeucht-Fraïssé game with an explicitly constructed
winning strategy for Duplicator.

Due to space limitations, most proofs are omitted in this extended abstract. 
The full paper including complete proofs of all our claims can be obtained from 
http://www.informatik.uni-mainz.de/~nisch/publications.html.



2 Preliminaries 

We use $\mathbb{Z}$ for the set of all integers, $\mathbb{N}$ for the set of all non-negative integers, $\mathbb{R}$
for the set of all reals, and $\mathbb{R}_{\ge 0}$ for the set of all non-negative reals.

Depending on the particular context, we will use $\vec{x}$ as abbreviation for a
sequence $x_1, .., x_m$ or a tuple $(x_1, .., x_m)$. Accordingly, if $q$ is a mapping defined
on all elements in $\vec{x}$, we will write $q(\vec{x})$ to denote the sequence $q(x_1), .., q(x_m)$
or the tuple $(q(x_1), .., q(x_m))$. If $R$ is an $m$-ary relation on the domain of $q$,
we write $q(R)$ to denote the relation $\{q(\vec{x}) : \vec{x} \in R\}$; and instead of $\vec{x} \in R$ we
often write $R(\vec{x})$. We write $x_1, .., x_m \mapsto y_1, .., y_m$ to denote the mapping $q$
with domain $\{x_1, .., x_m\}$ and range $\{y_1, .., y_m\}$ which satisfies $q(x_i) = y_i$, for
all $i \in \{1, .., m\}$. Throughout the rest of this section let $\mathbb{U}$ be $\mathbb{N}$ or $\mathbb{R}_{\ge 0}$.

Databases. A database schema $SC$ is a finite collection of relation symbols,
each of a fixed arity. An $SC$-database state $\mathcal{A}$ over $\mathbb{U}$ assigns a concrete $l$-ary
relation $R^{\mathcal{A}} \subseteq \mathbb{U}^l$ to each $l$-ary relation symbol $R \in SC$. In the literature,
attention is often restricted to finite databases. Note, however, that we allow
database relations to be infinite.

The set $adom(\mathcal{A})$ of those elements of $\mathbb{U}$ which occur in one of $\mathcal{A}$'s relations is
called the active domain of $\mathcal{A}$. If $q$ is a mapping defined on the active domain
of $\mathcal{A}$, we write $q(\mathcal{A})$ to denote the $SC$-database state with $R^{q(\mathcal{A})} = q(R^{\mathcal{A}})$, for
every $R \in SC$. Accordingly, we use $q(\mathcal{A}, \vec{a})$ as abbreviation for $(q(\mathcal{A}), q(\vec{a}))$.

An $m$-ary $SC$-query $Q$ over $\mathbb{U}$ is a mapping which assigns to every
$SC$-database state $\mathcal{A}$ over $\mathbb{U}$ an $m$-ary relation $Q(\mathcal{A}) \subseteq \mathbb{U}^m$. For $m = 0$, we have
that $Q(\mathcal{A}) \in \{\text{true}, \text{false}\}$ and call $Q$ a Boolean query.
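
The notions of database state, active domain, and image under a mapping can be made concrete for finite data. The following is only a minimal sketch: it assumes relations are finite sets of integer tuples (the paper explicitly allows infinite relations), and the names `adom` and `apply_mapping` are illustrative, not taken from the paper.

```python
from typing import Callable, Dict, Set, Tuple

Relation = Set[Tuple[int, ...]]
DatabaseState = Dict[str, Relation]  # relation symbol -> finite relation


def adom(state: DatabaseState) -> Set[int]:
    """Active domain: all elements occurring in some relation of the state."""
    return {x for rel in state.values() for tup in rel for x in tup}


def apply_mapping(q: Callable[[int], int], state: DatabaseState) -> DatabaseState:
    """Image q(A): every relation R is replaced by {q(x) : x in R}, componentwise."""
    return {name: {tuple(q(x) for x in tup) for tup in rel}
            for name, rel in state.items()}


# A small state with a binary relation E and a unary relation P.
A = {"E": {(0, 1), (1, 2)}, "P": {(2,)}}
q = lambda n: 10 * n + 3  # an arbitrary order-preserving mapping on N

print(sorted(adom(A)))
print(apply_mapping(q, A))
```

Because `q` is strictly increasing, the relative order of the active domain elements is the same in `A` and in its image, which is the situation order generic queries are concerned with.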

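Let $\mathcal{K}$ be a class of $SC$-database states over $\mathbb{U}$. We say that $Q$ is FO-definable on
$\mathcal{K}$ over $(\mathbb{U}, \le)$ (resp., $(\mathbb{U}, \le, +)$), if there is an $m$-ary formula $\varphi(\vec{x}) \in FO(SC, \le)$
(resp., $FO(SC, \le, +)$), such that for every $\mathcal{A} \in \mathcal{K}$ and for every $m$-tuple $\vec{a} \in \mathbb{U}^m$ it
is true that $\vec{a} \in Q(\mathcal{A})$ if and only if $(\mathbb{U}, \le, \mathcal{A}, \vec{a}) \models \varphi(\vec{x})$ (resp.,
$(\mathbb{U}, \le, +, \mathcal{A}, \vec{a}) \models \varphi(\vec{x})$). Here, a formula in $FO(SC, \le)$ is a first-order formula in the language
$SC \cup \{=, \le\}$. Similarly, $FO(SC, \le, +)$ consists of all first-order formulas in the
language $SC \cup \{=, \le, +\}$.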

Ehrenfeucht-Fraïssé Games. Our main technical tool will be the
Ehrenfeucht-Fraïssé game. For our purposes, we consider two versions of the game:
the $\le$-game and the $+$-game. Both are played by two players, Spoiler and
Duplicator, on two structures $\mathbf{A} = (\mathbb{U}, \mathcal{A}, \vec{a})$ and $\mathbf{B} = (\mathbb{U}, \mathcal{B}, \vec{b})$, where $\mathcal{A}$ and $\mathcal{B}$ are
$SC$-database states over $\mathbb{U}$, and $\vec{a}$ and $\vec{b}$ are sequences of length $m$ in $\mathbb{U}$.

There is a fixed number $k$ of rounds. In each round $i \in \{1, .., k\}$

- Spoiler chooses one element $a^{(i)} \in \mathbb{U}$ in $\mathbf{A}$, or an element $b^{(i)} \in \mathbb{U}$ in $\mathbf{B}$;

- then Duplicator chooses an element in the other structure, i.e., an element
$b^{(i)} \in \mathbb{U}$ in $\mathbf{B}$, if Spoiler's move was in $\mathbf{A}$, or an element $a^{(i)} \in \mathbb{U}$ in $\mathbf{A}$,
otherwise.

After $k$ rounds, the game finishes with elements $a^{(1)}, .., a^{(k)}$ chosen in $\mathbf{A}$ and
$b^{(1)}, .., b^{(k)}$ chosen in $\mathbf{B}$. Duplicator has won the $\le$-game if, restricted to the
sequences $\vec{a}, a^{(1)}, .., a^{(k)}$ and $\vec{b}, b^{(1)}, .., b^{(k)}$, respectively, the structures $\mathbf{A}$ and
$\mathbf{B}$ are indistinguishable with respect to $=$, $SC$, and $\le$, i.e., if the mapping
$\vec{a}, a^{(1)}, .., a^{(k)} \mapsto \vec{b}, b^{(1)}, .., b^{(k)}$ is a partial $\le$-isomorphism (see Definition 1 below)
between $(\mathbb{U}, \mathcal{A}, \vec{a})$ and $(\mathbb{U}, \mathcal{B}, \vec{b})$. Duplicator has won the $+$-game if this mapping
even is a partial $+$-isomorphism.

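Definition 1 (Partial Isomorphism). Let $\mathcal{A}$ and $\mathcal{B}$ be $SC$-database states
over $\mathbb{U}$, and let $\vec{a}$ and $\vec{b}$ be sequences of length $m$ in $\mathbb{U}$. Furthermore, let $a^{(1)}, .., a^{(k)}$
and $b^{(1)}, .., b^{(k)}$ be elements in $\mathbb{U}$. The mapping
$\pi : \vec{a}, a^{(1)}, .., a^{(k)} \mapsto \vec{b}, b^{(1)}, .., b^{(k)}$
is called a partial isomorphism between $(\mathbb{U}, \mathcal{A}, \vec{a})$ and $(\mathbb{U}, \mathcal{B}, \vec{b})$ if the
following holds true, where $\vec{a} = a^{(0)}, a^{(-1)}, .., a^{(-m+1)}$ and $\vec{b} = b^{(0)}, b^{(-1)}, .., b^{(-m+1)}$.

(i) $a^{(i)} = a^{(j)}$ iff $b^{(i)} = b^{(j)}$, for every $i, j$ with $-m < i, j \le k$,

(ii) $a^{(i)} \in adom(\mathcal{A})$ iff $b^{(i)} \in adom(\mathcal{B})$, for every $i$ with $-m < i \le k$, and
$R^{\mathcal{A}}(a^{(i_1)}, .., a^{(i_l)})$ iff $R^{\mathcal{B}}(b^{(i_1)}, .., b^{(i_l)})$, for every relation symbol $R \in SC$
of arity, say, $l$, and for all $i_1, .., i_l$ with $-m < i_1, .., i_l \le k$.

$\pi$ is called a partial $\le$-isomorphism if additionally we have

(iii) $a^{(i)} \le a^{(j)}$ iff $b^{(i)} \le b^{(j)}$, for every $i, j$ with $-m < i, j \le k$.

$\pi$ is called a partial $+$-isomorphism if, in addition to (i)-(iii), we have

(iv) $a^{(i)} + a^{(j)} = a^{(l)}$ iff $b^{(i)} + b^{(j)} = b^{(l)}$, for all $i, j, l$ with $-m < i, j, l \le k$. □

We write $(\mathbb{U}, \mathcal{A}, \vec{a}) \equiv^{\le}_{k} (\mathbb{U}, \mathcal{B}, \vec{b})$ (resp., $(\mathbb{U}, \mathcal{A}, \vec{a}) \equiv^{+}_{k} (\mathbb{U}, \mathcal{B}, \vec{b})$) to indicate that
Duplicator has a winning strategy in the $k$-round $\le$-game (resp., $+$-game) on
$(\mathbb{U}, \mathcal{A}, \vec{a})$ and $(\mathbb{U}, \mathcal{B}, \vec{b})$. The fundamental use of the game comes from the fact
that it characterises first-order logic (cf., e.g., [5]). In our context, this can be
formulated as follows: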

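Theorem 1 (Ehrenfeucht, Fraïssé). Let $\mathcal{K}$ be a class of $SC$-database states
over $\mathbb{U}$. An $m$-ary $SC$-query $Q$ over $\mathbb{U}$ is not FO-definable on $\mathcal{K}$ over $(\mathbb{U}, \le)$
(resp., $(\mathbb{U}, \le, +)$) if and only if, for every number $k \in \mathbb{N}$, there are $\mathcal{A}, \mathcal{B} \in \mathcal{K}$
and tuples $\vec{a}$ and $\vec{b}$ of length $m$ in $\mathbb{U}$ with $\vec{a} \in Q(\mathcal{A})$ and $\vec{b} \notin Q(\mathcal{B})$, such that
$(\mathbb{U}, \mathcal{A}, \vec{a}) \equiv^{\le}_{k} (\mathbb{U}, \mathcal{B}, \vec{b})$ (resp., $(\mathbb{U}, \mathcal{A}, \vec{a}) \equiv^{+}_{k} (\mathbb{U}, \mathcal{B}, \vec{b})$). □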
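
The winning condition of the game, i.e. conditions (i)-(iv) of Definition 1, can be checked mechanically once the game is over. The sketch below assumes finite relations over the integers and represents the sequences "distinguished elements followed by chosen elements" as plain lists; the function name `is_partial_iso` and the `schema` argument (relation symbol to arity) are illustrative choices, not notation from the paper.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

Relation = Set[Tuple[int, ...]]
DatabaseState = Dict[str, Relation]


def adom(state: DatabaseState) -> Set[int]:
    """Active domain of a finite database state."""
    return {x for rel in state.values() for tup in rel for x in tup}


def is_partial_iso(schema: Dict[str, int], A: DatabaseState, B: DatabaseState,
                   a: List[int], b: List[int],
                   order: bool = False, plus: bool = False) -> bool:
    """Check whether a[i] -> b[i] is a partial isomorphism (conditions (i), (ii)),
    a partial <=-isomorphism (order=True) or a partial +-isomorphism (plus=True)."""
    assert len(a) == len(b)
    order = order or plus  # a +-isomorphism must also satisfy (iii)
    idx = range(len(a))
    adom_a, adom_b = adom(A), adom(B)
    # (ii), first part: membership in the active domain is preserved
    if any((a[i] in adom_a) != (b[i] in adom_b) for i in idx):
        return False
    for i, j in product(idx, repeat=2):
        if (a[i] == a[j]) != (b[i] == b[j]):            # (i)
            return False
        if order and (a[i] <= a[j]) != (b[i] <= b[j]):  # (iii)
            return False
    # (ii), second part: every database relation is preserved
    for name, arity in schema.items():
        for pos in product(idx, repeat=arity):
            in_a = tuple(a[i] for i in pos) in A[name]
            in_b = tuple(b[i] for i in pos) in B[name]
            if in_a != in_b:
                return False
    if plus:                                            # (iv)
        for i, j, l in product(idx, repeat=3):
            if (a[i] + a[j] == a[l]) != (b[i] + b[j] == b[l]):
                return False
    return True


schema = {"P": 1}
A = {"P": {(1,), (2,)}}
B = {"P": {(2,), (8,)}}
print(is_partial_iso(schema, A, B, [1, 2], [2, 8], order=True))
print(is_partial_iso(schema, A, B, [1, 2], [2, 8], plus=True))
```

In the example the mapping 1, 2 to 2, 8 respects order and the relation P, but it is not a partial $+$-isomorphism, since 1 + 1 = 2 while 2 + 2 is not 8. This is exactly the kind of distinction Spoiler tries to force in the $+$-game.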

3 Collapse Results 

Databases Embedded in $(\mathbb{N}, \le, +)$.

We call a mapping $q : \mathbb{N} \to \mathbb{N}$ order-preserving if, for every $m, n \in \mathbb{N}$, we have
$m \le n$ iff $q(m) \le q(n)$. An $SC$-query $Q$ over $\mathbb{N}$ is called order generic on an
$SC$-database state $\mathcal{A}$ over $\mathbb{N}$ iff for every order-preserving mapping $q : \mathbb{N} \to \mathbb{N}$,
it is true that $q(Q(\mathcal{A})) = Q(q(\mathcal{A}))$, i.e., $\vec{a} \in Q(\mathcal{A})$ iff $q(\vec{a}) \in Q(q(\mathcal{A}))$, for every
$\vec{a}$. Our main theorem states that addition does not add to the expressive power
of first-order logic over the natural numbers for defining order generic queries.

Theorem 2. First-order logic has the natural-generic collapse over $(\mathbb{N}, \le, +)$
for arbitrary databases, i.e.: Let $Q$ be an $m$-ary $SC$-query over $\mathbb{N}$, and let $\mathcal{K}$
be a class of $SC$-database states over $\mathbb{N}$ on which $Q$ is order generic. If $Q$ is
FO-definable on $\mathcal{K}$ over $(\mathbb{N}, \le, +)$, then it already is so over $(\mathbb{N}, \le)$. □

Theorem 2 is a direct consequence of the following result: 

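Theorem 3. For every $k \in \mathbb{N}$ there exists a number $r(k) \in \mathbb{N}$ and an
order-preserving mapping $q : \mathbb{N} \to \mathbb{N}$ such that, for every database schema $SC$ and
every $m \in \mathbb{N}$, the following holds: If $\mathcal{A}$ and $\mathcal{B}$ are $SC$-database states over $\mathbb{N}$,
and if $\vec{a}$ and $\vec{b}$ are sequences of length $m$ in $\mathbb{N}$ with
$(\mathbb{N}, \mathcal{A}, \vec{a}) \equiv^{\le}_{r(k)} (\mathbb{N}, \mathcal{B}, \vec{b})$, then
$(\mathbb{N}, q(\mathcal{A}, \vec{a})) \equiv^{+}_{k} (\mathbb{N}, q(\mathcal{B}, \vec{b}))$. □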

The proof of Theorem 3 will be given in Section 4. 

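Proof of Theorem 2: We assume that $Q$ is not FO-definable on $\mathcal{K}$ over
$(\mathbb{N}, \le)$; and we need to show that $Q$ is not FO-definable on $\mathcal{K}$ over $(\mathbb{N}, \le, +)$,
either. From Theorem 1 we know that, for each number $k$, there are $\mathcal{A}, \mathcal{B} \in \mathcal{K}$
and tuples $\vec{a}$ and $\vec{b}$ of length $m$ in $\mathbb{N}$ with $\vec{a} \in Q(\mathcal{A})$ and $\vec{b} \notin Q(\mathcal{B})$, such that
$(\mathbb{N}, \mathcal{A}, \vec{a}) \equiv^{\le}_{r(k)} (\mathbb{N}, \mathcal{B}, \vec{b})$. By Theorem 3, then,
$(\mathbb{N}, q(\mathcal{A}, \vec{a})) \equiv^{+}_{k} (\mathbb{N}, q(\mathcal{B}, \vec{b}))$.
Furthermore, since $q$ is order-preserving and since $Q$ is order generic on $\mathcal{A}$ and $\mathcal{B}$,
we know that $q(\vec{a}) \in Q(q(\mathcal{A}))$ and $q(\vec{b}) \notin Q(q(\mathcal{B}))$. Hence, from Theorem 1 we
obtain that $Q$ is not FO-definable over $(\mathbb{N}, \le, +)$. □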

Nicely Representable Databases Embedded in $(\mathbb{R}_{\ge 0}, \le, +)$.

Let $I = (I_n)_{n \in \mathbb{N}}$ be a sequence of real closed intervals $I_n = [l_n, r_n]$. We call $I$
nice if $r_n < l_{n+1}$, for all $n \in \mathbb{N}$, and the sequence $(r_n)_{n \in \mathbb{N}}$ is unbounded.

An $SC$-database state $\mathcal{C}$ over $\mathbb{R}_{\ge 0}$ is called nicely representable if

- its active domain is $\bigcup_{n \in \mathbb{N}} I_n$ for a nice sequence $(I_n)_{n \in \mathbb{N}}$ of real closed
intervals, and

- all relations of $\mathcal{C}$ are constant on all the intervals, i.e., if $R$ is an $m$-ary relation
in $\mathcal{C}$ and if $x_1, y_1 \in I_{n_1}, \ldots, x_m, y_m \in I_{n_m}$, then we have $R(x_1, .., x_m)$ iff
$R(y_1, .., y_m)$.

A mapping $q : \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$ is called an order-automorphism of $\mathbb{R}_{\ge 0}$ if it is bijective
and increasing (note that this, in particular, implies that $q$ is continuous). An
$SC$-query $Q$ over $\mathbb{R}_{\ge 0}$ is called order generic on an $SC$-database state $\mathcal{C}$ over $\mathbb{R}_{\ge 0}$
iff for every order-automorphism $q$ of $\mathbb{R}_{\ge 0}$, it is true that $q(Q(\mathcal{C})) = Q(q(\mathcal{C}))$,
i.e., $\vec{c} \in Q(\mathcal{C})$ iff $q(\vec{c}) \in Q(q(\mathcal{C}))$, for every $\vec{c}$.

Theorem 4. First-order logic has the natural-generic collapse over $(\mathbb{R}_{\ge 0}, \le, +)$
for nicely representable databases, i.e.: Let $Q$ be an $m$-ary $SC$-query over $\mathbb{R}_{\ge 0}$,
and let $\mathcal{K}$ be a class of nicely representable $SC$-database states over $\mathbb{R}_{\ge 0}$ on
which $Q$ is order generic. If $Q$ is FO-definable on $\mathcal{K}$ over $(\mathbb{R}_{\ge 0}, \le, +)$, then it
already is so over $(\mathbb{R}_{\ge 0}, \le)$. □

Theorem 4 is a direct consequence of the following result: 

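Theorem 5. For every $k \in \mathbb{N}$ there exists a number $r'(k) \in \mathbb{N}$ such that, for
every database schema $SC$ and every $m \in \mathbb{N}$, the following holds: If $\mathcal{C}$ and $\mathcal{D}$ are
nicely representable $SC$-database states over $\mathbb{R}_{\ge 0}$, and if $\vec{c}$ and $\vec{d}$ are sequences
of length $m$ in $\mathbb{R}_{\ge 0}$ with $(\mathbb{R}_{\ge 0}, \mathcal{C}, \vec{c}) \equiv^{\le}_{r'(k)} (\mathbb{R}_{\ge 0}, \mathcal{D}, \vec{d})$, then there are
order-automorphisms $q'$ and $\tilde{q}'$ of $\mathbb{R}_{\ge 0}$ such that
$(\mathbb{R}_{\ge 0}, q'(\mathcal{C}, \vec{c})) \equiv^{+}_{k} (\mathbb{R}_{\ge 0}, \tilde{q}'(\mathcal{D}, \vec{d}))$. □

The proof of Theorem 5 will be given in Section 5. We can obtain Theorem 4 from
Theorem 5 in exactly the same way as we obtained Theorem 2 from Theorem 3.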

4 Proof of Theorem 3 

Our proof of Theorem 3 is an adaptation of Lynch's proof of the following theorem:

Theorem ([7, Theorem 3.7]). For every $k \in \mathbb{N}$ there exists a number $d(k) \in \mathbb{N}$
and an infinite set $Q \subseteq \mathbb{N}$ such that, for all subsets $A, B \subseteq Q$, the following holds:
If $|A| = |B|$ or $d(k) < |A|, |B| < \infty$, then $(\mathbb{N}, A) \equiv^{+}_{k} (\mathbb{N}, B)$. □

Unfortunately, neither the statement nor the proof of Lynch’s theorem gives us 
directly what we need. Going through Lynch’s proof in detail, we will modify 
and extend his notation and his reasoning in a way appropriate for obtaining 
our Theorem 3. 

However, to illustrate the overall proof idea, let us first try to explain Lynch's
proof intuitively. For simplicity, we concentrate on subsets $A, B \subseteq Q$ of the
same size and discuss what Duplicator has to do in order to win the $k$-round
$+$-game on $(\mathbb{N}, A)$ and $(\mathbb{N}, B)$. Assume that, after $i-1$ rounds, $a^{(1)}, .., a^{(i-1)}$ have
been played in $(\mathbb{N}, A)$, and $b^{(1)}, .., b^{(i-1)}$ in $(\mathbb{N}, B)$. Let Spoiler choose some
element $a^{(i)}$ in $(\mathbb{N}, A)$. When choosing $b^{(i)}$ in $(\mathbb{N}, B)$, Duplicator has to make sure
that whatever Spoiler can do in the remaining $k-i$ rounds in one structure can
be matched in the other. In particular, this means that any sum over the $a^{(j)}$
behaves in relation to $A$ exactly as the corresponding sum over the $b^{(j)}$ behaves
in relation to $B$. For instance, for any sets $J, J' \subseteq \{1, .., i\}$, it should hold that
there is some $a \in Q$ that lies between $\sum_{j \in J} a^{(j)}$ and $\sum_{j \in J'} a^{(j)}$ if and only
if there is some $b \in Q$ that lies between $\sum_{j \in J} b^{(j)}$ and $\sum_{j \in J'} b^{(j)}$. But it is
not enough to consider simple sums over previously played elements. Since with
$O(r)$ additions it is possible to generate $s \cdot a^{(j)}$ from $a^{(j)}$ for any $s \le 2^r$, we also
have to consider linear combinations with large coefficients. Furthermore, Spoiler
can alternate his moves between the two structures, which results in the necessity
of dealing with even more complex linear combinations. One can only handle
all these complications because, as the game progresses, the number of rounds
left for Spoiler to do all these things decreases. This means, for instance, that
the coefficients and the length of the linear combinations we have to consider
decrease: after the last round, the only relevant linear combinations are simple
additions of chosen elements.
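
The remark that $O(r)$ additions suffice to produce $s \cdot a$ for any $s \le 2^r$ is the usual repeated-doubling argument. The following sketch is an illustration added here, not part of Lynch's proof; the function name is hypothetical. It builds $s \cdot a$ using nothing but additions and counts how many were needed.

```python
def multiply_by_additions(a: int, s: int) -> tuple:
    """Return (s * a, number_of_additions) for s >= 1, using only '+' on values
    derived from a. Doubling yields a, 2a, 4a, ...; the binary digits of s
    select which of these are summed up."""
    assert s >= 1
    result = None      # running sum of the selected powers
    power = a          # a * 2^t in iteration t
    additions = 0
    while s > 0:
        if s & 1:
            if result is None:
                result = power
            else:
                result = result + power
                additions += 1
        s >>= 1
        if s > 0:
            power = power + power
            additions += 1
    return result, additions


print(multiply_by_additions(7, 13))
```

For $s \le 2^r$ the loop performs at most $r$ doublings and at most $r$ further additions, so the count stays below $2r$. In game terms, each such addition costs Spoiler one round, which is why the admissible coefficients shrink as the number of remaining rounds decreases.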

Let us now concentrate on the proof of Theorem 3. 

4.1 Notation 

We first define a function $r$ and, following Lynch, two functions $f$ and $g$, which
will be used as parameters in the proof. Of course, all we can say at this point to
justify this particular choice of functions is that they will make the technicalities
of the proof work.

$r(0) := 1$ ,  $r(i+1) := r(i) + 2^{i+3}$ ,

$f(0) := 1$ ,  $f(i+1) := 2 f(i)^4$ ,

$g(0) := 0$ ,  $g(i+1) := 2 g(i) f(i)^2 + f(i)!$ .

Let $k \in \mathbb{N}$ be fixed. Choose any sequence $p_0, p_1, p_2, \ldots$ in $\mathbb{N}$ with

$p_0 = 0$,

$p_i \ge 2^{k+3} f(k)^3 p_{i-1} + 2 g(k) f(k)^2$ , for all $i > 0$ , and

$p_i \equiv p_j \pmod{f(k)!}$ , for all $i, j > 0$ .

Here, $r \equiv s \pmod{n}$ means that $r - s \in n \cdot \mathbb{Z}$ (for $r, s \in \mathbb{R}$ and $n \in \mathbb{N}$). It is
obvious that such a sequence $p_0, p_1, \ldots$ does exist (cf. [7]).

We define the order-preserving mapping $q : \mathbb{N} \to \mathbb{N}$ via $q(i) := p_{i+1}$ for all
$i \in \mathbb{N}$; and we define $Q$ to be the range of $q$, i.e., $Q := q(\mathbb{N}) = \{p_1, p_2, p_3, \ldots\}$.
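
To get a feeling for how fast these parameters grow, the sequence can be generated for very small $k$. The sketch below is hedged in two ways. First, the exponents in the recurrences are an assumption of this sketch, read as $r(i+1) = r(i) + 2^{i+3}$, $f(i+1) = 2f(i)^4$, $g(i+1) = 2g(i)f(i)^2 + f(i)!$ and $p_i \ge 2^{k+3} f(k)^3 p_{i-1} + 2g(k)f(k)^2$. Second, the concrete choice "smallest admissible multiple of $f(k)!$" is just one of many valid sequences; the function names are illustrative.

```python
from math import factorial


def r(i: int) -> int:
    """r(0) = 1, r(i+1) = r(i) + 2^(i+3)   (exponent assumed, see lead-in)."""
    value = 1
    for t in range(i):
        value += 2 ** (t + 3)
    return value


def f(i: int) -> int:
    """f(0) = 1, f(i+1) = 2 * f(i)^4."""
    value = 1
    for _ in range(i):
        value = 2 * value ** 4
    return value


def g(i: int) -> int:
    """g(0) = 0, g(i+1) = 2 * g(i) * f(i)^2 + f(i)!   (exponent assumed)."""
    value = 0
    for t in range(i):
        value = 2 * value * f(t) ** 2 + factorial(f(t))
    return value


def sequence_p(k: int, length: int) -> list:
    """First `length` members p_0, p_1, ... of one admissible sequence:
    every p_i with i > 0 is a multiple of f(k)!, hence all are congruent."""
    modulus = factorial(f(k))
    p = [0]
    while len(p) < length:
        bound = 2 ** (k + 3) * f(k) ** 3 * p[-1] + 2 * g(k) * f(k) ** 2
        bound = max(bound, p[-1] + 1)              # keep the sequence strictly increasing
        p.append(-(-bound // modulus) * modulus)   # round up to a multiple of f(k)!
    return p


def q(i: int, p: list) -> int:
    """The order-preserving mapping q(i) := p_{i+1}."""
    return p[i + 1]


print([r(i) for i in range(3)], f(2), g(2))
print(sequence_p(1, 4))
```

Only $k \le 2$ is practical here: $f(3) = 2097152$, so $f(3)!$ is far too large to compute. This matches the remark that the functions are chosen to make the proof work, not to be small.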

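Let $\mathcal{A}$ and $\mathcal{B}$ be $SC$-database states over $\mathbb{N}$, and let $\vec{a}$ and $\vec{b}$ be sequences of
length $m$ in $\mathbb{N}$. We define $\mathfrak{A}, \vec{\mathfrak{a}} := q(\mathcal{A}, \vec{a})$ and $\mathfrak{B}, \vec{\mathfrak{b}} := q(\mathcal{B}, \vec{b})$. Furthermore, let
$\vec{\mathfrak{a}} = \mathfrak{a}^{(0)}, \mathfrak{a}^{(-1)}, .., \mathfrak{a}^{(-m+1)}$ and $\vec{\mathfrak{b}} = \mathfrak{b}^{(0)}, \mathfrak{b}^{(-1)}, .., \mathfrak{b}^{(-m+1)}$.

Our assumption is that Duplicator has a winning strategy in the $r(k)$-round
$\le$-game on $(\mathbb{N}, \mathcal{A}, \vec{a})$ and $(\mathbb{N}, \mathcal{B}, \vec{b})$ (which, henceforth, will be called the $\le$-game).
Our aim is to show that Duplicator has a winning strategy in the $k$-round
$+$-game on $(\mathbb{N}, \mathfrak{A}, \vec{\mathfrak{a}})$ and $(\mathbb{N}, \mathfrak{B}, \vec{\mathfrak{b}})$ (which, henceforth, will be called the $+$-game).

For each round $i \in \{1, .., k\}$ of the $+$-game we use $\mathfrak{a}^{(i)}$ and $\mathfrak{b}^{(i)}$, respectively, to
denote the element chosen in that round in $(\mathbb{N}, \mathfrak{A}, \vec{\mathfrak{a}})$ and in $(\mathbb{N}, \mathfrak{B}, \vec{\mathfrak{b}})$, respectively.
We will translate each move of Spoiler in the $+$-game, say $\mathfrak{a}^{(i)}$ (if Spoiler
chooses in $(\mathbb{N}, \mathfrak{A}, \vec{\mathfrak{a}})$), into a number of moves $a^{(i)}_1, .., a^{(i)}_{n_i}$ for a "virtual Spoiler" in
the $\le$-game. Then we can find the answers $b^{(i)}_1, .., b^{(i)}_{n_i}$ of a "virtual Duplicator"
playing according to the winning strategy in the $\le$-game, and we can translate
these answers into a move $\mathfrak{b}^{(i)}$ for Duplicator in the $+$-game. (The case when
Spoiler chooses $\mathfrak{b}^{(i)}$ in $(\mathbb{N}, \mathfrak{B}, \vec{\mathfrak{b}})$ is symmetric.)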

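Before we can describe Duplicator's winning strategy in the $+$-game, we have
to fix some further notation: As an abbreviation, for $i \in \{0, .., k\}$, we use $\vec{a}^{(i)}$
to denote the sequence $\vec{a}, a^{(1)}_1, .., a^{(1)}_{n_1}, .., a^{(i)}_1, .., a^{(i)}_{n_i}$ of all positions chosen in
$(\mathbb{N}, \mathcal{A}, \vec{a})$ until the end of the $i$-th round of the $+$-game. Analogously, we use
$\vec{b}^{(i)}$ to denote the corresponding sequence in $(\mathbb{N}, \mathcal{B}, \vec{b})$. Furthermore, we define
$\mathfrak{a}^{(i)}_j := q(a^{(i)}_j)$ and $\mathfrak{b}^{(i)}_j := q(b^{(i)}_j)$; and we use $\vec{\mathfrak{a}}^{(i)}$ and $\vec{\mathfrak{b}}^{(i)}$ to denote the sequence
of the $q$-images of the elements in the sequence $\vec{a}^{(i)}$ and $\vec{b}^{(i)}$, respectively. Clearly,
it holds that $\mathfrak{a}^{(i)}_j, \mathfrak{b}^{(i)}_j \in Q$.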

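Let $i \in \{0, .., k\}$. A partial mapping $c$ from $\{\mathfrak{a}^{(1)}, .., \mathfrak{a}^{(i)}\} \cup Q$ to $\{\mathfrak{b}^{(1)}, .., \mathfrak{b}^{(i)}\} \cup Q$
is called an $i$-correspondence if

(i) $\{\mathfrak{a}^{(1)}, .., \mathfrak{a}^{(i)}\} \cup \{\vec{\mathfrak{a}}^{(i)}\}$ is in the domain of $c$,

(ii) $c(\mathfrak{a}^{(l)}) = \mathfrak{b}^{(l)}$ for all $l \in \{1, .., i\}$,

(iii) $c(\vec{\mathfrak{a}}^{(i)}) = \vec{\mathfrak{b}}^{(i)}$,

(iv) $c(x) \in Q$ if and only if $x \in Q$ (for all $x$ in the domain of $c$),

(v) $c$ is order-preserving on $Q$.

An $i$-correspondence $c$ is called $SC$-respecting if it is a partial isomorphism
between $(\mathbb{N}, \mathfrak{A}, \vec{\mathfrak{a}})$ and $(\mathbb{N}, \mathfrak{B}, \vec{\mathfrak{b}})$ (note that we do not consider $\le$ or $+$ here).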

An $i$-vector in $\mathfrak{A}$ is a sequence $s := (x_1, .., x_n, \alpha_1, .., \alpha_n, \beta)$, where

(i) $n \le 2^{k-i+1}$,

(ii) $x_1, .., x_n$ are pairwise distinct elements in $\{\mathfrak{a}^{(1)}, .., \mathfrak{a}^{(i)}\} \cup Q$,

(iii) $\alpha_j = \frac{u_j}{u'_j}$ with $u_j, u'_j \in \mathbb{Z}$ and $|u_j|, |u'_j| \le f(k-i)$, for each $j \in \{1, .., n\}$,

(iv) $\beta \in \mathbb{R}$ with $|\beta| \le g(k-i)$.

An $i$-vector in $\mathfrak{B}$ is defined analogously.

A minor $i$-vector is an $i$-vector where additionally we have

(iv)' $|\beta| \le g(k-i) - f(k-i-1)! = 2 g(k-i-1) f(k-i-1)^2$.

The elements $x_1, .., x_n$ are called the terms of the $i$-vector $s$; $\alpha_1, .., \alpha_n$ are called
the coefficients of $s$; and $\overline{s} := \sum_{j=1}^{n} \alpha_j x_j + \beta$ is called the evaluation of $s$.

If $c$ is an $i$-correspondence and if $s = (x_1, .., x_n, \alpha_1, .., \alpha_n, \beta)$ is an $i$-vector in
$\mathfrak{A}$ whose terms are in the domain of $c$, then we write $c(s)$ to denote the image
of $s$ under $c$, i.e., $c(s)$ is the $i$-vector $(c(x_1), .., c(x_n), \alpha_1, .., \alpha_n, \beta)$ in $\mathfrak{B}$.

4.2 Duplicator's Strategy in the $+$-Game

We will show that Duplicator can play the $+$-game in such a way that the
following four conditions hold at the end of each round $i$, for $i \in \{0, .., k\}$:

(1) $(\mathbb{N}, \mathcal{A}, \vec{a}^{(i)}) \equiv^{\le}_{r(k-i)} (\mathbb{N}, \mathcal{B}, \vec{b}^{(i)})$

(2) $\mathfrak{a}^{(i)} \equiv \mathfrak{b}^{(i)} \pmod{f(k-i)!}$ (if $i \ne 0$)

(3) The mapping $c : \mathfrak{a}^{(1)}, .., \mathfrak{a}^{(i)}, \vec{\mathfrak{a}}^{(i)} \mapsto \mathfrak{b}^{(1)}, .., \mathfrak{b}^{(i)}, \vec{\mathfrak{b}}^{(i)}$ is an $SC$-respecting
$i$-correspondence.

(4) Let $d$ be an $i$-correspondence and let $s_1$ and $s_2$ be $i$-vectors in $\mathfrak{A}$ whose
terms are in the domain of $d$. Then $\overline{s_1} \le \overline{s_2}$ if and only if $\overline{d(s_1)} \le \overline{d(s_2)}$.

It should be clear that if the conditions (3) and (4) are satisfied for $i = k$, then the
mapping $c$ defined in condition (3) is a partial $+$-isomorphism between $(\mathbb{N}, \mathfrak{A}, \vec{\mathfrak{a}})$
and $(\mathbb{N}, \mathfrak{B}, \vec{\mathfrak{b}})$, and hence Duplicator has won the game.
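
Condition (4) compares evaluations of linear combinations before and after applying a correspondence. A small sketch may help to see what is being compared. It assumes rational $\beta$ (the paper allows real $\beta$) so that exact arithmetic with `Fraction` is possible, and the class and function names are illustrative only. The dictionary `d` below is a toy mapping, not a correspondence produced by the actual strategy.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Dict, Tuple


@dataclass(frozen=True)
class IVector:
    terms: Tuple[int, ...]        # x_1, ..., x_n (pairwise distinct)
    coeffs: Tuple[Fraction, ...]  # alpha_1, ..., alpha_n
    beta: Fraction                # additive constant (rational here, an assumption)

    def evaluate(self) -> Fraction:
        """Evaluation: sum of alpha_j * x_j, plus beta."""
        total = sum((a * x for a, x in zip(self.coeffs, self.terms)), Fraction(0))
        return total + self.beta


def image(s: IVector, d: Dict[int, int]) -> IVector:
    """Image d(s): the terms are mapped, coefficients and beta stay unchanged."""
    return IVector(tuple(d[x] for x in s.terms), s.coeffs, s.beta)


def respects_order(d: Dict[int, int], s1: IVector, s2: IVector) -> bool:
    """The equivalence required by condition (4) for one pair of vectors."""
    before = s1.evaluate() <= s2.evaluate()
    after = image(s1, d).evaluate() <= image(s2, d).evaluate()
    return before == after


d = {8: 1032, 1032: 132104}  # toy order-preserving mapping on two elements
s1 = IVector((8, 1032), (Fraction(1, 2), Fraction(-1, 2)), Fraction(3))
s2 = IVector((8,), (Fraction(2),), Fraction(0))

print(s1.evaluate(), image(s1, d).evaluate(), respects_order(d, s1, s2))
```

The rapid growth of the sequence $p_0, p_1, \ldots$ is what makes such comparisons stable: the largest term of a vector dominates its evaluation, so moving all terms along an order-preserving map on $Q$ does not change which of two evaluations is larger.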

We first remark that the four conditions are satisfied at the beginning: 

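Lemma 1. If $(\mathbb{N}, \mathcal{A}, \vec{a}) \equiv^{\le}_{r(k)} (\mathbb{N}, \mathcal{B}, \vec{b})$, then (1)-(4) hold for $i = 0$. □

We now assume that (1)-(4) hold for $i-1$, where $i \in \{1, .., k\}$; and we show that
in the $i$-th round Duplicator can play in such a way that (1)-(4) hold for $i$. Let
us assume that Spoiler has chosen $\mathfrak{a}^{(i)}$ in $(\mathbb{N}, \mathfrak{A}, \vec{\mathfrak{a}})$ (the case when Spoiler has
chosen $\mathfrak{b}^{(i)}$ in $(\mathbb{N}, \mathfrak{B}, \vec{\mathfrak{b}})$ is symmetric).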

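We first determine two $(i-1)$-vectors, or minor $(i-1)$-vectors, $s_m$ and $s_M$
which approximate $\mathfrak{a}^{(i)}$ from below and from above as closely as possible: If
$\mathfrak{a}^{(i)} \in \{\mathfrak{a}^{(1)}, .., \mathfrak{a}^{(i-1)}\} \cup Q$, we take $s_m = s_M = (\mathfrak{a}^{(i)}, 1, 0)$. If
$\mathfrak{a}^{(i)} \notin \{\mathfrak{a}^{(1)}, .., \mathfrak{a}^{(i-1)}\} \cup Q$,
but there is some $(i-1)$-vector $s$ with $\overline{s} = \mathfrak{a}^{(i)}$, we take $s_m = s_M = s$. If there is
no $(i-1)$-vector $s$ with $\overline{s} = \mathfrak{a}^{(i)}$, then let $s_m$ be a minor $(i-1)$-vector such that
$\overline{s_m}$ is maximal among all minor $(i-1)$-vectors $s$ with $\overline{s} < \mathfrak{a}^{(i)}$, and let $s_M$ be a
minor $(i-1)$-vector such that $\overline{s_M}$ is minimal among all minor $(i-1)$-vectors $s$
with $\overline{s} > \mathfrak{a}^{(i)}$. (In particular, $\overline{s_m} \ge 0$.)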

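Let $\mathfrak{a}^{(i)}_1, .., \mathfrak{a}^{(i)}_{n_i}$ be those terms of $s_m$ and $s_M$ that are in $Q$. From the
sequence $\mathfrak{a}^{(i)}_1, .., \mathfrak{a}^{(i)}_{n_i}$ in $Q = q(\mathbb{N})$ we determine the corresponding sequence
$a^{(i)}_1, .., a^{(i)}_{n_i}$ in $(\mathbb{N}, \mathcal{A}, \vec{a})$ via $a^{(i)}_j := q^{-1}(\mathfrak{a}^{(i)}_j)$. These are the moves for the
"virtual Spoiler" in the $\le$-game. From condition (1) (for $i-1$) we know that

$(\mathbb{N}, \mathcal{A}, \vec{a}^{(i-1)}) \equiv^{\le}_{r(k-i+1)} (\mathbb{N}, \mathcal{B}, \vec{b}^{(i-1)})$ .

Thus, the "virtual Duplicator" can find answers $b^{(i)}_1, .., b^{(i)}_{n_i}$ in $(\mathbb{N}, \mathcal{B}, \vec{b})$ such that

$(\mathbb{N}, \mathcal{A}, \vec{a}^{(i-1)}, a^{(i)}_1, .., a^{(i)}_{n_i}) \equiv^{\le}_{r(k-i+1)-n_i} (\mathbb{N}, \mathcal{B}, \vec{b}^{(i-1)}, b^{(i)}_1, .., b^{(i)}_{n_i})$ .

Since $n_i \le 2^{k-i+3}$ and by the choice of $r$, we have $n_i \le r(k-i+1) - r(k-i)$, and
hence condition (1) is satisfied for $i$.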

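Let $c$ be the $SC$-respecting $(i-1)$-correspondence obtained from condition (3)
(for $i-1$). We extend $c$ to $\hat{c}$ by defining it also on $\mathfrak{a}^{(i)}_1, .., \mathfrak{a}^{(i)}_{n_i}$ via
$\hat{c}(\mathfrak{a}^{(i)}_j) := \mathfrak{b}^{(i)}_j := q(b^{(i)}_j)$ (for all $j \in \{1, .., n_i\}$). Since condition (1) is satisfied for $i$, $\hat{c}$ must be
an $SC$-respecting $(i-1)$-correspondence with $\hat{c}(\vec{\mathfrak{a}}^{(i)}) = \vec{\mathfrak{b}}^{(i)}$. Furthermore, from
$0 \le \overline{s_m} \le \overline{s_M}$ and from condition (4) (for $i-1$) we obtain $0 \le \overline{\hat{c}(s_m)} \le \overline{\hat{c}(s_M)}$.

For her choice of $\mathfrak{b}^{(i)}$ in $(\mathbb{N}, \mathfrak{B}, \vec{\mathfrak{b}})$, Duplicator makes use of the following lemma:

Lemma 2.

(a) $\overline{s_m} \equiv \overline{\hat{c}(s_m)} \pmod{f(k-i)!}$ , and

(b) if $\overline{s_m} < \overline{s_M}$ then $\overline{\hat{c}(s_M)} - \overline{\hat{c}(s_m)} > f(k-i)!$ . □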

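Duplicator chooses $\mathfrak{b}^{(i)}$ in $(\mathbb{N}, \mathfrak{B}, \vec{\mathfrak{b}})$ as follows:

If $\overline{s_m} = \overline{s_M}$ then $\mathfrak{b}^{(i)} := \overline{\hat{c}(s_m)} \ge 0$, and according to Lemma 2 (a) we have
$\mathfrak{a}^{(i)} \equiv \mathfrak{b}^{(i)} \pmod{f(k-i)!}$. In particular, since $\mathfrak{a}^{(i)} \in \mathbb{N}$, this implies that $\mathfrak{b}^{(i)} \in \mathbb{N}$.

If $\overline{s_m} < \mathfrak{a}^{(i)} < \overline{s_M}$ then, according to Lemma 2 (b), we have $\overline{\hat{c}(s_M)} - \overline{\hat{c}(s_m)} >
f(k-i)!$, and hence there exists a $\mathfrak{b}^{(i)} \in \mathbb{N}$ with $0 \le \overline{\hat{c}(s_m)} < \mathfrak{b}^{(i)} < \overline{\hat{c}(s_M)}$ and
$\mathfrak{a}^{(i)} \equiv \mathfrak{b}^{(i)} \pmod{f(k-i)!}$. In both cases, condition (2) is satisfied for $i$.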
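
The second case relies on a simple counting fact: an open interval that is longer than the modulus contains a natural number from every residue class. The sketch below illustrates this step only. The function name and its arguments are hypothetical; `low` and `high` stand for the evaluations of the images of the lower and the upper approximating vector.

```python
from fractions import Fraction
from math import floor


def choose_answer(a: int, low: Fraction, high: Fraction, modulus: int) -> int:
    """Return the smallest natural number b with low < b < high and
    b congruent to a modulo `modulus`. Requires 0 <= low and high - low > modulus."""
    assert low >= 0 and high - low > modulus
    start = floor(low) + 1                  # smallest integer strictly above low
    b = start + ((a - start) % modulus)     # first number >= start in a's residue class
    assert b < high                         # guaranteed because the gap exceeds the modulus
    return b


print(choose_answer(17, Fraction(41, 2), Fraction(30), 6))
```

Since `start` is at most `low + 1` and at most `modulus - 1` is added, the result is at most `low + modulus`, which is strictly below `high`. This is why Lemma 2 (b) asks for a gap strictly larger than $f(k-i)!$.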

For showing that condition (3) is satisfied for $i$, we distinguish between the
two cases "$\mathfrak{a}^{(i)} \in \{\mathfrak{a}^{(1)}, .., \mathfrak{a}^{(i-1)}\} \cup Q$" and "$\mathfrak{a}^{(i)} \notin \{\mathfrak{a}^{(1)}, .., \mathfrak{a}^{(i-1)}\} \cup Q$", we make
use of the fact that $\hat{c}$ is an $SC$-respecting $(i-1)$-correspondence with $\hat{c}(\vec{\mathfrak{a}}^{(i)}) =
\vec{\mathfrak{b}}^{(i)}$, and we make use of the following lemma:

Lemma 3. We have $\mathfrak{a}^{(i)} \in Q$ if and only if $\mathfrak{b}^{(i)} \in Q$. □

The validity of condition (4) for $i$ is ensured by the following lemma:

Lemma 4. With Duplicator's choice of $\mathfrak{b}^{(i)}$ as described above, condition (4)
holds for $i$. □

Summing up, we have shown that the conditions (1)-(4) hold for $i = 0$.
Furthermore, we have shown for each $i \in \{1, .., k\}$ that if they hold for $i-1$, then
Duplicator can play in such a way that they hold for $i$. In particular, we
conclude that Duplicator can play in such a way that the conditions (1)-(4) hold
for $i = k$, and hence Duplicator has a winning strategy in the $k$-round $+$-game
on $(\mathbb{N}, \mathfrak{A}, \vec{\mathfrak{a}})$ and $(\mathbb{N}, \mathfrak{B}, \vec{\mathfrak{b}})$. This completes our proof of Theorem 3. □

In fact, our proof shows the following result, which is stronger, but more technical 
than Theorem 3, and which we will use in the proof of Theorem 5. 

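Proposition 1. For every $k \in \mathbb{N}$ there exists a number $r(k) \in \mathbb{N}$ and an
order-preserving mapping $q : \mathbb{N} \to \mathbb{N}$ such that, for every database schema $SC$, for
every $m \in \mathbb{N}$, for all $SC$-database states $\mathcal{A}$ and $\mathcal{B}$ over $\mathbb{N}$, and all sequences $\vec{a}$
and $\vec{b}$ of length $m$ in $\mathbb{N}$ with $(\mathbb{N}, \mathcal{A}, \vec{a}) \equiv^{\le}_{r(k)} (\mathbb{N}, \mathcal{B}, \vec{b})$, the following holds:

Duplicator can play the $+$-game on $(\mathbb{N}, q(\mathcal{A}, \vec{a}))$ and $(\mathbb{N}, q(\mathcal{B}, \vec{b}))$ in such a way
that, for each $i \le k$, after the $i$-th round the situation is as follows: Let
$q(\vec{a}) = \mathfrak{a}^{(0)}, \mathfrak{a}^{(-1)}, .., \mathfrak{a}^{(-m+1)}$, and $q(\vec{b}) = \mathfrak{b}^{(0)}, \mathfrak{b}^{(-1)}, .., \mathfrak{b}^{(-m+1)}$. Furthermore let, for
$j \in \{1, .., i\}$, $\mathfrak{a}^{(j)}$ and $\mathfrak{b}^{(j)}$, respectively, be the elements chosen in the $j$-th round
in $(\mathbb{N}, q(\mathcal{A}, \vec{a}))$ and $(\mathbb{N}, q(\mathcal{B}, \vec{b}))$, respectively. The following holds true:

- The mapping $\mathfrak{a}^{(-m+1)}, .., \mathfrak{a}^{(i)} \mapsto \mathfrak{b}^{(-m+1)}, .., \mathfrak{b}^{(i)}$ is a partial $+$-isomorphism between
$(\mathbb{N}, \mathfrak{A}, \vec{\mathfrak{a}})$ and $(\mathbb{N}, \mathfrak{B}, \vec{\mathfrak{b}})$,

- $\mathfrak{a}^{(j)} \equiv \mathfrak{b}^{(j)} \pmod{f(k-i)!}$, for each $j$ with $-m < j \le i$, and

- $\sum_{j=-m+1}^{i} \frac{u_j}{u'_j} \mathfrak{a}^{(j)} + \beta \;\le\; \sum_{j=-m+1}^{i} \frac{v_j}{v'_j} \mathfrak{a}^{(j)} + \delta$
  if and only if
  $\sum_{j=-m+1}^{i} \frac{u_j}{u'_j} \mathfrak{b}^{(j)} + \beta \;\le\; \sum_{j=-m+1}^{i} \frac{v_j}{v'_j} \mathfrak{b}^{(j)} + \delta$,
for all $u_j, u'_j, v_j, v'_j \in \mathbb{Z}$ with $|u_j|, |u'_j|, |v_j|, |v'_j| \le f(k-i)$ (for $-m < j \le i$),
and $\beta, \delta \in \mathbb{R}$ with $|\beta|, |\delta| \le g(k-i)$. □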



5 Proof of Theorem 5 

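Let the functions $r$ and $q$ be chosen according to Theorem 3. Let $k \in \mathbb{N}$ be fixed
and let $SC$ be a database schema. We define $r'(k) := 1 + r(k+2)$.

Let $m \in \mathbb{N}$, let $\mathcal{C}$ and $\mathcal{D}$ be nicely representable $SC$-database states over
$\mathbb{R}_{\ge 0}$, and let $\vec{c}$ and $\vec{d}$ be sequences of length $m$ in $\mathbb{R}_{\ge 0}$ with
$(\mathbb{R}_{\ge 0}, \mathcal{C}, \vec{c}) \equiv^{\le}_{r'(k)} (\mathbb{R}_{\ge 0}, \mathcal{D}, \vec{d})$. We need to find order-automorphisms $q'$ and $\tilde{q}'$ of $\mathbb{R}_{\ge 0}$ such that
$(\mathbb{R}_{\ge 0}, q'(\mathcal{C}, \vec{c})) \equiv^{+}_{k} (\mathbb{R}_{\ge 0}, \tilde{q}'(\mathcal{D}, \vec{d}))$.

Our proof makes use of Theorem 3. It is structured as illustrated in Figure 1, i.e.:
In the first step, from $\mathcal{C}, \vec{c}$ and $\mathcal{D}, \vec{d}$, we define $\mathcal{A}, \vec{a}$ and $\mathcal{B}, \vec{b}$, respectively, such that
$(\mathbb{N}, \mathcal{A}, \vec{a}) \equiv^{\le}_{r'(k)-1} (\mathbb{N}, \mathcal{B}, \vec{b})$. Since $r'(k) - 1 = r(k+2)$, we obtain from Theorem 3
that $(\mathbb{N}, q(\mathcal{A}, \vec{a})) \equiv^{+}_{k+2} (\mathbb{N}, q(\mathcal{B}, \vec{b}))$. In the second step, we modify $q : \mathbb{N} \to \mathbb{N}$ to
order-automorphisms $q'$ and $\tilde{q}'$ of $\mathbb{R}_{\ge 0}$, and we translate Duplicator's winning
strategy in the $+$-game on $(\mathbb{N}, q(\mathcal{A}, \vec{a}))$ and $(\mathbb{N}, q(\mathcal{B}, \vec{b}))$ to a winning strategy in
the $+$-game on $(\mathbb{R}_{\ge 0}, q'(\mathcal{C}, \vec{c}))$ and $(\mathbb{R}_{\ge 0}, \tilde{q}'(\mathcal{D}, \vec{d}))$.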



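$(\mathbb{R}_{\ge 0}, \mathcal{C}, \vec{c}) \equiv^{\le}_{r'(k)} (\mathbb{R}_{\ge 0}, \mathcal{D}, \vec{d})$          $(\mathbb{R}_{\ge 0}, q'(\mathcal{C}, \vec{c})) \equiv^{+}_{k} (\mathbb{R}_{\ge 0}, \tilde{q}'(\mathcal{D}, \vec{d}))$

          $\Downarrow$ Step 1                              $\Uparrow$ Step 2

$(\mathbb{N}, \mathcal{A}, \vec{a}) \equiv^{\le}_{r'(k)-1} (\mathbb{N}, \mathcal{B}, \vec{b})$   $\Longrightarrow$   $(\mathbb{N}, q(\mathcal{A}, \vec{a})) \equiv^{+}_{k+2} (\mathbb{N}, q(\mathcal{B}, \vec{b}))$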



Fig. 1. The Structure of Our Proof. 



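We start with Step 1 of Figure 1. Let the elements in the sequence $\vec{c}$ be named
by $c^{(0)}, c^{(-1)}, .., c^{(-m+1)}$. Since $\mathcal{C}$ is nicely representable, its active domain is
determined by a nice sequence $I = (I_n)_{n \in \mathbb{N}}$ of real closed intervals. We define
$I^{c} = (I^{c}_n)_{n \in \mathbb{N}}$ to be the nice sequence of real closed intervals such that
$\{I^{c}_n : n \in \mathbb{N}\} = \{I_n : n \in \mathbb{N}\} \cup \{[c^{(i)}, c^{(i)}] : -m < i \le 0$, and there is no $n \in
\mathbb{N}$ with $c^{(i)} \in I_n\}$. Let $e$ be the partial mapping from $\mathbb{R}_{\ge 0}$ to $\mathbb{N}$ which maps, for
each $n \in \mathbb{N}$, every element in $I^{c}_n$ to $n$. We define $\mathcal{A}, \vec{a} := e(\mathcal{C}, \vec{c})$. Analogously, we
define $\mathcal{B}, \vec{b} := \tilde{e}(\mathcal{D}, \vec{d})$, where $\tilde{e}$ is defined in a similar way as $e$ (where, instead
of $I$ and $I^{c}$, we use the nice sequence $J$ determined by the active domain of
$\mathcal{D}$, and the corresponding sequence $J^{d}$). From our assumption we know that
$(\mathbb{R}_{\ge 0}, \mathcal{C}, \vec{c}) \equiv^{\le}_{r'(k)} (\mathbb{R}_{\ge 0}, \mathcal{D}, \vec{d})$. We can use this winning strategy of Duplicator to
obtain

Claim 1: $(\mathbb{N}, \mathcal{A}, \vec{a}) \equiv^{\le}_{r'(k)-1} (\mathbb{N}, \mathcal{B}, \vec{b})$. □

From Theorem 3 we obtain that $(\mathbb{N}, q(\mathcal{A}, \vec{a})) \equiv^{+}_{k+2} (\mathbb{N}, q(\mathcal{B}, \vec{b}))$.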
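
The partial mapping $e$ of Step 1 collapses each interval of the nice sequence to its index. For a finite prefix of such a sequence this is a binary search; the following sketch assumes the intervals are given as a sorted list of pairs of left and right endpoints, and the function name is illustrative.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def interval_index(intervals: List[Tuple[float, float]], x: float) -> Optional[int]:
    """Return n with l_n <= x <= r_n, or None if x lies in no interval.
    `intervals` must satisfy r_n < l_{n+1} (a finite prefix of a nice sequence)."""
    lefts = [l for l, _ in intervals]
    n = bisect_right(lefts, x) - 1   # last interval whose left end is <= x
    if n >= 0 and x <= intervals[n][1]:
        return n
    return None


# Two proper intervals and one degenerate interval [2.5, 2.5], as it would be
# added for a distinguished element lying outside the active domain.
nice_prefix = [(0.0, 1.0), (2.5, 2.5), (4.0, 7.0)]
print([interval_index(nice_prefix, x) for x in (0.5, 2.5, 3.0, 5.0)])
```

Because the relations of a nicely representable database are constant on each interval, replacing every real number by the index of its interval loses no information about the database. This is what allows the $\le$-game over the reals to be transferred to the naturals.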

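We now proceed with Step 2 of Figure 1. Choose some $\varepsilon \in \mathbb{R}$ with $0 < \varepsilon < 1$.
For each $n \in \mathbb{N}$ let $l_n, r_n \in \mathbb{R}_{\ge 0}$ such that $I^{c}_n = [l_n, r_n]$. Let $q'$ be an
order-automorphism of $\mathbb{R}_{\ge 0}$ which satisfies the following, for each $n \in \mathbb{N}$: $q'(l_n) = q(n)$;
if $l_n < r_n$, then $q'(r_n) = q(n) + \varepsilon$; and if $c^{(i_1)} < \cdots < c^{(i_s)}$ are exactly those elements
in the sequence $\vec{c}$ which lie strictly between $l_n$ and $r_n$, then $q'(c^{(i_j)}) = q(n) +$
for each $j \in \{1, .., s\}$. The mapping $\tilde{q}'$ is defined analogously (where, instead
of $I^{c}_n$, we use $J^{d}_n$). We already know that $(\mathbb{N}, q(\mathcal{A}, \vec{a})) \equiv^{+}_{k+2} (\mathbb{N}, q(\mathcal{B}, \vec{b}))$. Using
Proposition 1, we can translate this winning strategy of Duplicator to obtain

Claim 2: $(\mathbb{R}_{\ge 0}, q'(\mathcal{C}, \vec{c})) \equiv^{+}_{k} (\mathbb{R}_{\ge 0}, \tilde{q}'(\mathcal{D}, \vec{d}))$. □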

Hence, our proof of Theorem 5 is complete. □ 

6 Discussion and Open Questions 

Genericity. Intuitively, order generic queries are those which essentially only 
depend on the order of the underlying structure. The formalization used in the 
present paper looks slightly different than the notion of local order genericity 
used, e.g., in [3] and [2]. However, it is not difficult to see that both notions are 
equivalent. 

Natural- Active Collapse. Another form of collapse which is of some interest 
concerns the difference between natural and active semantics. In natural seman- 
tics, quantifiers range over all of U. In active semantics they range only over 
adom{A), implying that only the active domain is relevant for query evaluation. 
Let us mention that Theorem 2 gives us a collapse from natural-semantics order
generic queries over $(\mathbb{N}, \le, +)$ to active-semantics queries over $(\mathbb{N}, \le)$ for
arbitrary databases.

Possible Extensions. Several extensions are conceivable. One obvious
question is whether our results can be extended from $\mathbb{N}$ to $\mathbb{Z}$ and from $\mathbb{R}_{\ge 0}$ to $\mathbb{R}$,
where in both cases, the active domain is allowed to extend infinitely in both
directions. Other potential extensions concern the restriction on the database 
over the real numbers. In [2], finitely representable databases are considered. 
Let us mention that meanwhile one of us has found a natural generalization of 
the notion of finitely representable databases, where an arbitrary (i.e., possibly 
infinite) number of regions is allowed (see [9]). 

Other Arithmetical Predicates. One interesting question is: How much
arithmetic is needed to defeat the natural-generic collapse? It is well-known
that with full arithmetic, i.e., with underlying structure $(\mathbb{N}, \le, +, *)$, no collapse
is possible, even for finite databases. Consequently, since $+$ is definable from
$*$ and $\le$ over $\mathbb{N}$ (cf. [8]), FO does not have the natural-generic collapse over
$(\mathbb{N}, \le, *)$. In [2], Belegradek, Stolboushkin and Taitslin discuss this question in
the context of finite databases embedded in $\mathbb{Z}$ and conjecture that the collapse
holds over any extension of $(\mathbb{Z}, \le, +)$ which has a decidable first-order theory.
We can support this conjecture with a result which, although of less interest in
database theory, might be interesting in its own right: Since $(\mathbb{N}, *)$ is the weak
direct product of $\omega$ copies of $(\mathbb{N}, +)$, we can translate the result of Theorem 3 to
obtain

Corollary 1. FO has the natural-generic collapse over (N, *) for arbitrary 
databases. □ 

Effectivity. Finally, there is the question of effective collapse: it would be
desirable to have an algorithm which, given an order generic $FO(SC, \le, +)$-formula,
constructs an equivalent $FO(SC, \le)$-formula.

Abstract. We consider the monadic second order logic with two
successor functions and equality, interpreted on the binary tree. We show
that a set of assignments is definable in the fragment $\Sigma_2$ of this logic
if and only if it is definable by a Büchi automaton. Moreover we show
that every set of second order assignments definable in $\Sigma_2$ with equality
is definable in $\Sigma_2$ without equality as well. The present paper is sketchy
due to space constraints; for more details and proofs see [7].



1 Introduction 

This paper lies in the framework of descriptive complexity, an important and
rapidly growing research area in theoretical computer science. Descriptive
complexity was proposed by Fagin in [3] as an approach to fundamental problems of
complexity theory such as whether NP equals co-NP. While ordinary
computational complexity theory is concerned with the amount of resources (such as
time or space) necessary to solve a given problem, the idea of descriptive
complexity is to study the expressibility of problems in some logical formalism.
For instance, in [3] Fagin shows that NP problems coincide (over finite
structures) with the problems expressible in existential second order logic. Since then,
there has been a large number of results in descriptive complexity. We note that
most of these results concern finite structures (which are those of interest for
the applications in computational complexity theory), but studying descriptive
complexity also over infinite structures makes sense and may lead to a better
understanding of the expressiveness of various logical systems.

In particular, in this paper we are interested in the monadic second order
logic (MSOL) over the binary tree and Büchi automata.

Monadic second order logic on the binary tree has long been studied since
the seminal paper [8], where Rabin shows that the set of all true sentences in
monadic second order logic with two successor functions (let us call this logic
Rabin logic), interpreted over the binary tree, is decidable. The tools used by Rabin
for this result are Rabin automata, a kind of finite automata over infinite trees;
in particular, Rabin shows that a property is definable in Rabin logic if and only
if it is definable by a Rabin automaton.

Büchi automata are also a kind of finite automata on infinite trees; they were
introduced earlier than Rabin automata, in [2], again as a technique for solving
decision problems in second order logic, and they have been studied by Rabin in
[9] (where they are called special automata). It turns out that Büchi automata
are indeed a special case of Rabin automata, and Rabin in [9] gives an example
of a property definable by a Rabin automaton but not by any Büchi automaton.
In particular, Büchi automata correspond to a proper subset of Rabin logic.

Rabin in [9] also gives a logical characterization of Büchi automata by means
of the weak monadic second order logic (WMSOL), that is, monadic second order
logic where the second order quantifiers range over finite sets. Rabin’s result is
that a set is definable by a Büchi automaton if and only if it is definable by
a formula formed by a sequence of existential monadic second order quantifiers
followed by a weak monadic second order formula.

In this paper we give another logical characterization of Büchi automata:
we show that a property is definable by a Büchi automaton if and only if it
is definable by a formula in the fragment $\Sigma_2$ of Rabin logic. We note that,
in the original definition of Rabin logic, the prefix ordering of the binary tree
is considered to be primitive; here instead, our result holds only if the prefix
ordering is not primitive, since $\Sigma_2$ with prefix ordering is equivalent to the entire
Rabin logic, as follows from [8].

Finally, as a side issue we note that, in the original definition of Rabin logic,
the equality relation is also considered to be primitive, so one can wonder whether
$\Sigma_2$ with equality is equiexpressive to $\Sigma_2$ without equality; here we prove that
the answer is affirmative (at least on second order assignments), by showing that
every set of second order assignments definable in $\Sigma_2$ with equality is definable
in $\Sigma_2$ without equality as well.

The rest of the paper is organized as follows. In Sects. 2 and 3 we recall
some basic notions about logic and Büchi automata. In Sect. 4 we state the
main result (that is, Büchi equals $\Sigma_2$) and the key lemma (that is, $\Pi_1$ is included
in WMSOL), and we prove that the key lemma implies the main result. After
introducing some definitions in Sect. 5, in Sect. 6 we sketch the proof of the key
lemma. In Sect. 7 we prove that equality can be eliminated in $\Sigma_2$ on second
order assignments. Section 8 is devoted to some concluding remarks.

2 Monadic Second Order Logic on the Binary Tree 

The kind of logic we are interested in is monadic second order logic. Given a
set F of function symbols, each of fixed arity, and a set R of first order relation
symbols, each of fixed arity, we denote by MSOL(F, R) the monadic second
order logic based on F and R. We recall briefly the syntax of MSOL(F, R)
terms and formulas.

First of all, MSOL(F, R) has a countable set of first order variables x, y, ...
(ranging over individuals) and a countable set of second order variables X, Y, ...
(ranging over sets of individuals). The (first order) terms of MSOL(F, R) are
obtained starting from the first order variables by iteratively applying the
function symbols belonging to F. The atomic formulas of MSOL(F, R) have the
form $t \in X$ or $r(t_1, \ldots, t_n)$, where $t, t_1, \ldots, t_n$ are first order terms, X is a
second order variable, and r is a relation symbol of arity n belonging to R. The
formulas of MSOL(F, R) are then obtained starting from the atomic formulas
by applying the boolean connectives $\wedge$, $\vee$, the first order, existential or
universal, quantifiers $\exists x$, $\forall x$ and the second order, existential or universal, quantifiers
$\exists X$, $\forall X$.

We recall also that the weak monadic second order logic over F and R,
denoted by WMSOL(F, R), is the same as MSOL(F, R) with the exception
that the second order quantifiers range over finite sets rather than over arbitrary
sets.

In particular, the logic which we called Rabin logic in the Introduction, and
which was studied by Rabin in [8], is MSOL($s_0$, $s_1$, =, <), where $s_0$, $s_1$ are two
unary function symbols (to be interpreted as the successor functions in the
binary tree) and =, < are two binary relation symbols (to be interpreted as the
equality relation and the prefix ordering on the binary tree); and the logic we
are interested in here is the fragment MSOL($s_0$, $s_1$, =) of Rabin logic.

We recall the definition of the fragments $\Sigma_n$ and $\Pi_n$ of MSOL. $\Sigma_0$ and $\Pi_0$
denote the set of first order formulas of MSOL, that is, the formulas without
second order quantifiers; and inductively we define a $\Sigma_{n+1}$ formula to be a
sequence of existential second order quantifiers followed by a $\Pi_n$ formula; and
dually we define a $\Pi_{n+1}$ formula to be a sequence of universal second order
quantifiers followed by a $\Sigma_n$ formula.

We note that the logics MSOL($s_0$, $s_1$, =) and MSOL($s_0$, $s_1$, =, <) are
equivalent, as the prefix ordering is definable in the former; however this is not true
in general for the corresponding levels $\Sigma_n$: in particular the levels $\Sigma_2$ differ, as
they correspond to Büchi automata and Rabin automata respectively (on the
other hand all levels $\Sigma_m(s_0, s_1, =)$ with $m \ge 3$ and $\Sigma_n(s_0, s_1, =, <)$ with $n \ge 2$
are equivalent, as follows from [8]).

The formulas of MSOL(F, R) can be given a semantics by defining an
interpretation, that is, a set U of individuals, and for each function symbol f in F
a function on U of the same arity as f, and for each relation symbol r in R a
relation on U of the same arity as r.

Given a set U and two natural numbers m, n, let us call assignment of type
(m, n) (over U) an (m + n)-tuple whose first m components are elements of U and
whose last n components are subsets of U. According to the standard Tarskian
semantic rules, each formula with m free first order variables and n free second 
order variables defines a set of assignments of type (m, n), that is the set of all as- 
signments which make it true. In particular, let us call second order assignments 
the assignments of type (0,n); then a formula without free first order variables 
and with n free second order variables defines a set of second order assignments. 

In this paper we fix the interpretations of our logical symbols as follows: the
set of individuals will always be the binary tree $\{0,1\}^*$, that is, the set of all
finite words over the alphabet {0, 1}, including the empty word $\epsilon$ (the root of
the tree); the function symbols $s_0$ and $s_1$ will be interpreted as the successor
functions, sending each word $w \in \{0,1\}^*$ to w0 and w1 respectively; and the
symbol = will be interpreted as the equality relation.

Sometimes we will adopt the usual, convenient genealogical terminology about
the binary tree; for instance we can say that the words w0 and w1 are the left
son and the right son of the word w respectively, that w is the father of w0 and
w1, etc.

Finally, given a word w, we denote by |w| the length of w.



3 Biichi Automata on Trees and Assignments 

In this section we review the basic notions about Büchi automata. Recall that,
given a set E, the powerset of E is the set P(E) of all the subsets of E.

A Büchi automaton is a system

$A = (\Sigma, Q, M, Q_0, F)$,

where $\Sigma$ is a finite set (the alphabet of the automaton), Q is a finite set of states,
M (the move function) is a function from $Q \times \Sigma$ to $P(Q \times Q)$, $Q_0 \subseteq Q$ is the
set of the initial states and $F \subseteq Q$ is the set of the accepting states.

Büchi automata work as follows. A $\Sigma$-tree is a function from $\{0,1\}^*$ to $\Sigma$. A
run of the automaton $A = (\Sigma, Q, M, Q_0, F)$ on a $\Sigma$-tree t is a mapping r from
$\{0,1\}^*$ to Q such that, for every $w \in \{0,1\}^*$, we have

$(r(w0), r(w1)) \in M(r(w), t(w))$. (1)

A run r is called accepting if $r(\epsilon) \in Q_0$ and for every infinite path $\pi$ through
$\{0,1\}^*$, and for infinitely many points $w \in \pi$, we have $r(w) \in F$. We say that
the automaton A accepts a tree t when it has an accepting run on t. The set of
trees defined by a Büchi automaton A is the set of all trees accepted by A.
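
As an illustration of condition (1), the following minimal sketch checks the local transition condition on a finite prefix of the tree only. The names `words_up_to`, `is_run_prefix`, and the toy automaton are assumptions made for this example and do not come from the paper; the acceptance condition itself (accepting states infinitely often on every path) concerns infinite paths and is not checked here.

```python
from itertools import product

def words_up_to(depth):
    """All binary words of length <= depth, shortest first."""
    return ["".join(p) for n in range(depth + 1) for p in product("01", repeat=n)]

def is_run_prefix(move, tree, run, depth):
    """Check condition (1) at every node w with |w| < depth.

    move: dict mapping (state, letter) -> set of (left_state, right_state)
    tree, run: functions from binary words (str) to letters / states
    """
    for w in words_up_to(depth - 1):
        pair = (run(w + "0"), run(w + "1"))
        if pair not in move.get((run(w), tree(w)), set()):
            return False
    return True

# Hypothetical one-letter automaton: state "a" forces ("b", "b") below it,
# and state "b" forces ("a", "a").
move = {("a", "x"): {("b", "b")}, ("b", "x"): {("a", "a")}}
tree = lambda w: "x"
run = lambda w: "a" if len(w) % 2 == 0 else "b"
```

Here `is_run_prefix(move, tree, run, 4)` holds, whereas the constant mapping to "a" already violates (1) at the root.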

In particular, if $\Sigma$ is the powerset of a set $\{V_1, \ldots, V_n\}$ of second order
variables, then the set of all $\Sigma$-trees is isomorphic to the set of all second order
assignments of type (0, n); the isomorphism is the map $\iota$ which sends the tree t to
$X_1, \ldots, X_n$, where $X_i = \{w \mid V_i \in t(w)\}$. So, for this particular kind of $\Sigma$, we say
that an automaton A on $\Sigma$ accepts the second order assignment $a = X_1, \ldots, X_n$
when it accepts the corresponding $\Sigma$-tree $\iota^{-1}(X_1, \ldots, X_n)$.
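
The correspondence between trees and assignments can be sketched as follows, restricted to a finite set of nodes so that it can be computed. The function names and the variable names "V1", "V2" are assumptions made for this example, not notation from the paper.

```python
def assignment_to_tree(sets, names):
    """Return the labelling t with t(w) = {V_i : w in X_i}."""
    def t(w):
        return frozenset(v for v, X in zip(names, sets) if w in X)
    return t

def tree_to_assignment(t, names, nodes):
    """Recover X_i = {w : V_i in t(w)}, restricted to the given nodes."""
    return [{w for w in nodes if v in t(w)} for v in names]

# Hypothetical assignment of type (0, 2) on nodes of depth at most 2.
names = ["V1", "V2"]
X1, X2 = {"", "0"}, {"01"}
nodes = ["", "0", "1", "00", "01", "10", "11"]
t = assignment_to_tree([X1, X2], names)
```

Going from the sets to the labelling and back returns the original sets on the chosen nodes, which is the finite shadow of the isomorphism described above.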

The terminology above can be slightly extended in order to take into account
first order variables. That is, let m, n be natural numbers, let $V_1, \ldots, V_{m+n}$ be
second order variables and let $\Sigma = P(\{V_1, \ldots, V_{m+n}\})$. Given an assignment of
type (m, n), say $a = x_1, \ldots, x_m, X_1, \ldots, X_n$, and an automaton A over $\Sigma$, we
say that A accepts a if it accepts the tree $\iota^{-1}(\{x_1\}, \ldots, \{x_m\}, X_1, \ldots, X_n)$.

So both MSOL formulas and Büchi automata (on suitable alphabets) define
essentially the same kind of objects, that is, sets of assignments, and this allows
us to compare Büchi definable sets with MSOL definable sets, as we will do in
the sequel.

4 Statement of the Main Result

Theorem 1. Let m, n be natural numbers and let E be a set of assignments of
type (m, n) over $\{0,1\}^*$.

The following are equivalent:

1. E is definable by a Büchi automaton;

2. E is definable by a formula in $\Sigma_2(s_0, s_1, =)$.

The following lemma implies the theorem: 

Lemma 1. (Key Lemma) Let m, n be natural numbers and let E be a set of
assignments of type (m, n) over $\{0,1\}^*$.

If E is definable in $\Pi_1(s_0, s_1, =)$, then E is definable in WMSOL($s_0$, $s_1$, =)
as well.

The proof of this lemma will be given in Sect. 6. Here is the proof of the 
theorem from the lemma. 

To prove that 1) implies 2), we write down in $\Sigma_2$ the condition

“the automaton A accepts the assignment $x_1, \ldots, x_m, X_1, \ldots, X_n$”. (2)

Let us consider the tree

$t = \iota^{-1}(\{x_1\}, \ldots, \{x_m\}, X_1, \ldots, X_n)$,

and let us enumerate the states of the automaton A in an arbitrary way:

$Q = \{q_1, \ldots, q_k\}$.

We begin by rewriting the condition (2) as follows:

“there is a k-tuple $p = (Z_{q_1}, \ldots, Z_{q_k})$ of subsets of $\{0,1\}^*$ such that:

- p is a partition of $\{0,1\}^*$;

- $\epsilon \in \bigcup_{q \in Q_0} Z_q$;

- p is built according to the labeling of t and the move function M of A;

- and the run associated to p is accepting”.

Now we rewrite each property of the tuple p above as follows:

- the property

“p is a partition of $\{0,1\}^*$” (3)

means that the sets $Z_q$ are pairwise disjoint and their union is the entire
$\{0,1\}^*$, and all this is expressible in first order logic (over the empty
signature), hence a fortiori in $\Pi_1(s_0, s_1)$;

- the property

“$\epsilon \in \bigcup_{q \in Q_0} Z_q$” (4)

is equivalent to

“every nonempty set closed under father meets $\bigcup_{q \in Q_0} Z_q$”, (5)

which is $\Pi_1(s_0, s_1)$ (this formalization of (4) is a kind of overkill, but has
the advantage of doing without equality, which will be useful in Sect. 7);

- the property

“p is built according to t and the move function M of A” (6)

is equivalent to

“for every $w \in \{0,1\}^*$ and for every triple (q, q', q'') of states and for any
$\sigma \in P(\{V_1, \ldots, V_{m+n}\})$, if $w \in Z_q$, $w0 \in Z_{q'}$, $w1 \in Z_{q''}$ and $t(w) = \sigma$, then
$(q', q'') \in M(q, \sigma)$”,

which is $\Sigma_0(s_0, s_1, =)$, hence $\Pi_1(s_0, s_1, =)$ (note that the condition “$t(w) = \sigma$”
amounts to a conjunction of $w = x_i$ or negations, with $1 \le i \le m$, and
$w \in X_j$ or negations, with $1 \le j \le n$);

- and the property

“the run associated to p is accepting” (7)

is equivalent to

“every infinite path of $\{0,1\}^*$ meets $\bigcup_{q \in F} Z_q$ infinitely often”, (8)

which turns out to be equivalent to:

“for every set $Y \subseteq \{0,1\}^*$, if Y satisfies the following conditions:

• for every $q \in F$, $Z_q \subseteq Y$;

• for every $w \in \{0,1\}^*$, if $w0 \in Y$ and $w1 \in Y$, then $w \in Y$,

then $Y = \{0,1\}^*$”,

which is $\Pi_1(s_0, s_1)$.

Summing up, the condition (2) is equivalent to a sequence of existential
second order quantifiers followed by a $\Pi_1(s_0, s_1, =)$ formula, that is, a
$\Sigma_2(s_0, s_1, =)$ formula, as desired.
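
For concreteness, two of the conditions used in this direction of the proof might be written out as follows. This is only a sketch: the implication and negation signs are used as the usual abbreviations, and the exact shape of the formulas is an assumption rather than the paper's own text.

```latex
% Property (3): the sets Z_q cover the tree and are pairwise disjoint.
\forall w\,\Bigl(\bigvee_{q \in Q} w \in Z_q\Bigr)
  \;\wedge\; \bigwedge_{q \neq q'} \forall w\,\neg\bigl(w \in Z_q \wedge w \in Z_{q'}\bigr)

% Reformulation of (8): any set Y that contains every Z_q with q in F and
% contains w whenever it contains both sons of w must be the whole tree.
\forall Y\,\Bigl[\Bigl(\bigwedge_{q \in F} \forall w\,(w \in Z_q \to w \in Y)
  \;\wedge\; \forall w\,\bigl(s_0(w) \in Y \wedge s_1(w) \in Y \to w \in Y\bigr)\Bigr)
  \to \forall w\,(w \in Y)\Bigr]
```

The first formula has no second order quantifier and uses neither successor functions nor equality; the second has a single universal second order quantifier in front of a first order formula, so it lies in $\Pi_1(s_0, s_1)$.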

Conversely, assuming the lemma, we have that $\Sigma_2(s_0, s_1, =)$, which is the
existential second order closure of $\Pi_1(s_0, s_1, =)$, is included in the existential
second order closure of WMSOL($s_0$, $s_1$, =), which coincides with the class of all
Büchi definable sets by the classical result of [9] already mentioned; this proves
that 2) implies 1) and hence concludes the proof of the theorem from the lemma.

5 Auxiliary Notations

5.1 Bimodalities 

Given two words v, w and a second order variable X, denoting a set of words,
we introduce the second order bimodality $(Xw^{-1})v$, whose semantics is the set
of words $\{yv \mid yw \in X\}$. We can view bimodalities as a kind of (second order)
terms in MSOL, and we can use them in formulas. For instance we can write
$z \in (X1^{-1})0$, which means that the word z is the left brother of an element of
the set X.

In the same vein, we allow ourselves to take also first order bimodalities, that
is, we introduce the terms $(xw^{-1})v$, where v and w are binary words and x is a
first order variable. This term will denote the set $\{yv \mid yw = x\}$; note that this
set always has at most one element.

For convenience we include among the bimodalities also “constant”
expressions of the kind $(Ew^{-1})v$, where E is any fixed set of binary words such as
$\{0,1\}^*$, $\emptyset$, etc., with the obvious semantics.

Finally we define the length of a bimodality $(Aw^{-1})v$ to be |v| + |w|.

For more comments on bimodalities see the journal version [7]. 
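
On finite sets of words the semantics of a bimodality can be computed directly. The following minimal sketch assumes binary words are represented as Python strings over "0" and "1"; the function name `bimodality` is chosen for this example only.

```python
def bimodality(X, w, v):
    """Semantics of (X w^{-1}) v: the set { y v : y w in X }.

    X is a finite set of binary words; w and v are binary words.
    """
    # Strip the suffix w from every word of X that ends with w, then append v.
    return {x[: len(x) - len(w)] + v for x in X if x.endswith(w)}

def bimodality_length(w, v):
    """Length of the bimodality (A w^{-1}) v, that is |v| + |w|."""
    return len(v) + len(w)

# Left brothers of the elements of X, as in the example z in (X 1^{-1}) 0.
left_brothers = bimodality({"01", "11", "0"}, "1", "0")
```

With this input the words "01" and "11" are right sons, so their left brothers "00" and "10" are returned, while "0" contributes nothing. A first order bimodality is the special case in which X is a singleton.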

5.2 Quasiinclusions 

In the next section we will build a normal form for first order logic over the 
binary tree. An ad hoc notion which we need to build our normal form is the 
notion of quasiinclusion. 

Given two sets A, B and a natural number k, we say that A is k-included in
B, written $A \subseteq_k B$, if the difference $A \setminus B$ has at most k elements. That is, A is
included in B up to at most k exceptions. We call quasiinclusion any expression
of the form $A \subseteq_k B$.

In particular, we have $A \subseteq B$ if and only if $A \subseteq_0 B$, so every inclusion is a
quasiinclusion.
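
For finite sets the definition translates directly into code. The function name `k_included` is an assumption made for this small sketch.

```python
def k_included(A, B, k):
    """Return True when A is k-included in B, i.e. A minus B has at most k elements."""
    return len(set(A) - set(B)) <= k

# A = {"0", "1", "00"} is included in B = {"0"} up to two exceptions, but not up to one.
A = {"0", "1", "00"}
B = {"0"}
```

Ordinary inclusion is recovered with k = 0.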

For more on quasiinclusions see the journal version [7]. 



6 Proof of the Key Lemma (Sketch) 

We are left to prove Lemma 1. To begin the proof, we observe that every
formula in $\Pi_1(s_0, s_1, =)$ is a sequence of universal monadic second order quantifiers
followed by a formula in $\Sigma_0(s_0, s_1, =)$, that is, a first order formula. Now we put
every formula of $\Sigma_0(s_0, s_1, =)$ into a normal form as follows.

6.1 A Normal Form Lemma for First Order Formulas 

Lemma 2. Every formula in $\Sigma_0(s_0, s_1, =)$ is equivalent to a boolean
combination of quasiinclusions $A \subseteq_k B$, where k is a natural number, A is a finite,
nonempty intersection of bimodalities, and B is a finite, nonempty union of
bimodalities.

The proof is omitted for lack of space, see the journal version [7].

By using the conjunctive normal form of propositional logic we have the 
following corollary: 

Corollary 1. Every formula in $\Sigma_0(s_0, s_1, =)$ is equivalent to a finite conjunction
of formulas of the form

$\bigl(\bigwedge_i A_i \subseteq_{k_i} B_i\bigr) \to \bigl(\bigvee_j C_j \subseteq_{l_j} D_j\bigr)$, (9)

where $k_i$, $l_j$ are natural numbers, $A_i$, $C_j$ are finite, nonempty intersections of
bimodalities and $B_i$, $D_j$ are finite, nonempty unions of bimodalities.

Since universal quantification distributes over conjunctions, from the previous 
corollary we obtain: 

Corollary 2. Every formula in $\Pi_1(s_0, s_1, =)$ is equivalent to a finite
conjunction of formulas of the form

$\forall X.\,\bigl(\bigwedge_i A_i \subseteq_{k_i} B_i\bigr) \to \bigl(\bigvee_j C_j \subseteq_{l_j} D_j\bigr)$, (10)

where X is a tuple of second order variables and $k_i$, $l_j$, $A_i$, $C_j$, $B_i$, $D_j$ are as
above.

6.2 The Main Argument 

By Cor. 2, and since WMSOL is closed under conjunction, the proof of Lemma
1 amounts to showing that any formula of the form

$\forall X.\,\bigl(\bigwedge_i A_i(X,Y) \subseteq_{k_i} B_i(X,Y)\bigr) \to \bigl(\bigvee_j C_j(X,Y) \subseteq_{l_j} D_j(X,Y)\bigr)$, (11)

where X is a tuple of second order variables, Y is a tuple of parameters (that
is, subsets of $\{0,1\}^*$) and $k_i$, $l_j$, $A_i$, $C_j$, $B_i$, $D_j$ are as in the previous subsection,
is equivalent to a WMSOL formula.

For notational convenience, we may abbreviate the formula (11) as
$\forall X.\,A(X) \subseteq_k B(X) \to C(X) \subseteq_l D(X)$; that is, we may suppress indexes and
parameters from the notation.

In order to express the formula (11) in the weak logic, we begin with the 
following lemma: 

Lemma 3. Let $A(X) \subseteq_k B(X)$ be a finite conjunction of quasiinclusions and
let $C(X) \subseteq_l D(X)$ be a finite disjunction of quasiinclusions, where X is a tuple
of second order variables, k, l are tuples of natural numbers, A, C are tuples
of finite, nonempty intersections of bimodalities, and B, D are tuples of finite,
nonempty unions of bimodalities.

The following are equivalent:

1. $\forall X.\,A(X) \subseteq_k B(X) \to C(X) \subseteq_l D(X)$;

2. For any tuple F of finite sets and for any tuple G of cofinite sets, both of
the same length as the tuple X, if there is X with $F \subseteq X \subseteq G$ (where
the inclusion is intended to hold componentwise) and $A(X) \subseteq_k B(X)$, then
$C(F) \subseteq_l D(G)$.

The proof is omitted for lack of space, see the journal version [7]. 

So, in order to show that the formula (11) is expressible in WMSOL, it is
enough to show that the property

“there is X with $F \subseteq X \subseteq G$ and $A(X) \subseteq_k B(X)$” (12)

is expressible in WMSOL. To this aim, we note that the inclusions $F \subseteq X \subseteq G$
together with $A(X) \subseteq_k B(X)$ form a finite system of quasiinclusions between
a tuple of finite, nonempty intersections of bimodalities and a tuple of finite,
nonempty unions of bimodalities. Hence it is enough to show that the existence
of a solution of any such system is expressible in WMSOL. The corresponding
lemma is:

Lemma 4. Let

$A_1(X,Y) \subseteq_{n_1} A'_1(X,Y) \wedge \ldots \wedge A_r(X,Y) \subseteq_{n_r} A'_r(X,Y)$ (13)

be a finite system of quasiinclusions, where X is a tuple of second order
variables (the unknowns), Y is a tuple of subsets of $\{0,1\}^*$ (the parameters),
$n_1, \ldots, n_r$ is a tuple of natural numbers, $A_1, \ldots, A_r$ is a tuple of finite, nonempty
intersections of bimodalities and $A'_1, \ldots, A'_r$ is a tuple of finite, nonempty unions
of bimodalities. Let us abbreviate the system by $A(X,Y) \subseteq_n A'(X,Y)$.

Let N be twice the maximum length of a bimodality of the system.

The following are equivalent:

1. The system (13) has a solution, i.e. there is a tuple X of subsets of $\{0,1\}^*$
such that $A(X,Y) \subseteq_n A'(X,Y)$;

2. For any finite subset F of $\{0,1\}^*$ there is a tuple $X_F$ of subsets of Ball(F, N)
such that, for any $F' \subseteq F$, we have:

$A(X_F \cap F', Y \cap F') \subseteq_n A'(X_F \cap \mathrm{Ball}(F', N), Y)$ (14)

(where Ball(F, N) is the set of all elements of $\{0,1\}^*$ which are reachable
from some element of F through at most N father-son or son-father steps).

The proof is omitted for lack of space, see the journal version [7]. 
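
The set Ball(F, N) used in point 2) of Lemma 4 is finite whenever F is finite, which can be seen from a direct breadth-first computation. The sketch below assumes binary words are Python strings; the function name `ball` is chosen for this example.

```python
def ball(F, N):
    """Nodes of the binary tree reachable from some element of F through
    at most N father-son or son-father steps."""
    reached = set(F)
    frontier = set(F)
    for _ in range(N):
        step = set()
        for w in frontier:
            step.update({w + "0", w + "1"})  # the two sons of w
            if w:                            # the root has no father
                step.add(w[:-1])
        frontier = step - reached
        reached |= frontier
    return reached
```

For example, the ball of radius 1 around the root consists of the root and its two sons, and the ball of radius 2 around the node "0" has nine elements.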

Since the point 2) of the lemma above is expressible in WMSOL, Lemma 1
is proved. Hence, by Sect. 4, Thm. 1 is proved as well.

7 About Second Order Assignments

In the previous sections we stated and proved a theorem which characterizes
Büchi definable sets of assignments by means of $\Sigma_2(s_0, s_1, =)$. It turns out that
if we consider only second order assignments, we can say a little more: the
equality symbol can be eliminated. That is:

Corollary 3. Let E be a set of second order assignments. If E is definable in
$\Sigma_2(s_0, s_1, =)$, then it is definable in $\Sigma_2(s_0, s_1)$ as well.

Proof. By Thm. 1 it is enough to show that, if E is Büchi definable, then E is
definable in $\Sigma_2(s_0, s_1)$. To this aim we look at the proof of Thm. 1 from Lemma
1 in Sect. 4, and we observe that, while writing down in $\Sigma_2$ the condition (2),
the only place where we need equality is when we write down $w = x_i$ for some
i with $1 \le i \le m$: so if the elements of E are second order assignments, then
m = 0 and there is no need of equality. This concludes the proof. □

8 Conclusion 

In this paper we have seen that, on the binary tree, Büchi automata are
equivalent to the fragment $\Sigma_2$ of monadic second order logic (without prefix ordering).

This has some pleasant consequences. Rabin in [9] exhibits a property which
is Büchi definable but whose complement is not Büchi definable; since Büchi is
equal to $\Sigma_2$, this implies that there is a property which is $\Sigma_2$ definable but whose
complement is not $\Sigma_2$ definable, which is the first theorem of [5]. Likewise, Hafer
in [4] exhibits a property which is Rabin definable but which is not definable by
any boolean combination of Büchi properties; since Büchi equals $\Sigma_2$ and Rabin
equals Lfa, this implies that there is a property which is but it is not defined 
by any boolean combination of $\Sigma_2$ formulas, which is the second theorem of [5].
Thus we found proofs of both theorems of [5] which are considerably simpler
than the original proofs.

Another consequence of our result concerns Kozen’s mu-calculus (see [6]);
that is, on the binary tree, the fragment $\Pi^\mu_2$ of the mu-calculus fixpoint
alternation depth hierarchy, which is equal to Büchi by [1], is equal to the level $\Sigma_2$ of
monadic second order logic. More generally, it would be interesting to compare
the various levels of the fixpoint alternation hierarchy of the mu-calculus with
the levels $\Sigma_n$ of the quantifier alternation hierarchy in monadic second order
logic, in some interesting class of graphs such as all graphs or all finite graphs:
we note that trees are not very interesting in this respect, since the monadic hi- 
erarchy on trees collapses to E^ (this is a kind of folklore result, see for instance 
[11]). This kind of problems will be the subject of subsequent papers. 

Note added in proof: an alternate, simple proof that $\Sigma_2$ equals Büchi
automata has been recently given in [10].
Abstract. Given a hypergraph with nonnegative costs on hyperedges
and a requirement function $r : 2^V \to \mathbb{Z}_+$, where V is the vertex set,
we consider the problem of finding a minimum cost hyperedge set F
such that for all $S \subseteq V$, F contains at least r(S) hyperedges incident
to S. In the case that r is weakly supermodular (i.e., r(V) = 0 and
$r(A) + r(B) \le \max\{r(A \cap B) + r(A \cup B),\ r(A - B) + r(B - A)\}$ for any
$A, B \subseteq V$), and the so-called minimum violated sets can be computed in
polynomial time, we present a primal-dual approximation algorithm with
performance guarantee $d_{\max} H(r_{\max})$, where $d_{\max}$ is the maximum degree
of the hyperedges with positive cost, $r_{\max}$ is the maximum requirement,
and $H(i) = \sum_{j=1}^{i} 1/j$ is the harmonic function. In particular, our algorithm
can be applied to the survivable network design problem in which the
requirement is that there should be at least $r_{st}$ hyperedge-disjoint paths
between each pair of distinct vertices s and t, for which $r_{st}$ is prescribed.



1 Introduction 

Given an undirected graph with nonnegative edge costs, the network design
problem is to find a minimum cost subgraph satisfying certain requirements. In
the survivable network design problem (SNDP), the requirement is that there
should be at least $r_{st}$ edge-disjoint paths between each pair of distinct vertices s
and t, for which $r_{st}$ is prescribed. It arises from problems of designing a minimum
cost network such that certain vertices remain connected after some edges fail.
In the Steiner tree problem, which is an important special case, we are given a
subset T of the vertex set V, and the objective is to find a minimum cost edge
set to connect all the vertices in T. Clearly this is an SNDP, in which $r_{st} = 1$
if $s, t \in T$ and $r_{st} = 0$ otherwise. It is known that the Steiner tree problem is
NP-hard even for unit cost ([9]). Thus the general SNDP is NP-hard, too.

We focus on developing approximation algorithms. An $\alpha$-approximation
algorithm is a polynomial time algorithm which always outputs a solution of cost
at most $\alpha$ times the optimum. The first approximation algorithm for the SNDP
is given by Williamson et al. [11]. They formalize a basic mechanism for
using the primal-dual method. It picks edge sets in $r_{\max} = \max\{r_{st}\}$ phases, and
each phase tries to augment the size of cuts with deficiency by using an integer
program, which is solved within factor 2 by a primal-dual approach. Their
algorithm has a performance guarantee of $2 r_{\max}$. In [4] Goemans et al. show that by
augmenting the size of only those cuts with maximum deficiency, a
$2H(r_{\max})$-approximation algorithm can be obtained, where $H(i) = \sum_{j=1}^{i} 1/j$ is the harmonic
function. For a detailed overview of these primal-dual algorithms, we refer the
readers to the well-written survey [6]. Recently, Jain [7] shows that there is an
edge e with $x^*_e \ge 1/2$ in any basic solution $x^*$ of the LP relaxation of the SNDP
(where the constraint $x_e \in \{0,1\}$ is relaxed to $0 \le x_e \le 1$ for every edge e). Then it
is shown that an iterative rounding process yields a 2-approximation algorithm.

In a very recent paper [8], Jain et al. considered the element connectivity 
problem (ECP). In that problem, vertices are partitioned into two sets: terminals 
and non-terminals. Only edges and non-terminals, called the elements, can fail 
and only pairs of terminals have connectivity requirements, specifying the least 
number of element-disjoint paths to be realized. (Note that only the edges have 
costs.) The SNDP is a special case of the ECP with empty non-terminal set. 
Following the basic algorithmic outline established in [11] and [4], they show
that a $2H(r_{\max})$-approximation algorithm can be obtained.

In this paper we consider the SNDP in hypergraphs (SNDPHG). The
difference between a hypergraph and a graph is that the edges of a hypergraph, called
hyperedges, may contain more than two vertices as their endpoints. The degree of
a hyperedge is defined as the number of endpoints contained in it. By replacing
edges in the definition of SNDP with hyperedges, we get the definition of
SNDPHG. Thus the SNDP is a special case of the SNDPHG in which the degrees
of all the hyperedges are 2. We note that the ECP is also a special case of the
SNDPHG. To see this, consider a non-terminal w. Let $\{v_1, w\}, \ldots, \{v_k, w\}$ be
the edges that are incident to w. For each i = 1, ..., k, replace edge $\{v_i, w\}$ with
two edges $\{v_i, w_i\}$ and $\{w_i, w\}$, introducing a new terminal $w_i$. Let the cost of
edge $\{v_i, w_i\}$ be the same as $\{v_i, w\}$. Let $r_{st} = 0$ if at least one of s and t is a
new terminal. Then replace w and all the edges $\{\{w_i, w\} \mid i = 1, \ldots, k\}$ with a
hyperedge $\{w_1, \ldots, w_k\}$ of zero cost. Repeat this process until there is no
non-terminal left. Clearly in this way we can reduce the ECP to the SNDPHG
in linear time. In fact, letting $d_{\max}$ denote the maximum degree of the hyperedges
with positive cost, we have shown that the ECP is a special case of the
SNDPHG in which $d_{\max} = 2$. Furthermore, we notice that the SNDPHG can model
more general network design problems, e.g., it is easy to see that the problem of
multicasting in a network involving router cost can be modeled by an SNDPHG
in which routers are modeled by hyperedges.

Clearly, the general SNDPHG is also NP-hard even for unit cost and $r_{\max} = 1$.
In [10] Takeshita et al. extend the primal-dual approximation algorithm of [5]
to the SNDPHG in which $r_{\max} = 1$. They show a k-approximation algorithm,
where k is the maximum degree of hyperedges. In this paper we design an
approximation algorithm for the SNDPHG based on the primal-dual schema established
in [11], [4]. As a result, we can get a performance guarantee of $d_{\max} H(r_{\max})$. Our
result includes (or improves) the former results of [11], [4], [8] in which $d_{\max} = 2$,
and [10] in which $r_{\max} = 1$. Like the algorithms for the SNDP in graphs, our
algorithm is also applicable to more general problems, provided that the input
satisfies two conditions (see Conditions 1 and 2 in the next two sections).

We present the algorithm for problems satisfying Conditions 1 and 2 in 
Sect. 3. In Sect. 4 we give a proof of the performance guarantee. We then show 
in Sect. 5 that the SNDPHG satisfies the two conditions. 



2 Preliminaries 

All (hyper)graphs treated in this paper are undirected unless stated otherwise.
Directed graphs are referred to as digraphs. Let G be a (hyper)graph, and let V(G) and
E(G) denote the vertex set and (hyper)edge set of G, respectively. A (hyper)edge
e with endpoints $v_1, \ldots, v_k$ is denoted by $e = \{v_1, \ldots, v_k\}$ and it may be treated
as the set $\{v_1, \ldots, v_k\}$ of the endpoints. For an $S \subseteq V(G)$, the subgraph of G
induced by S is denoted by G[S] (i.e., $G[S] = (S, E(G) \cap 2^S)$). The set of neighbors of
S is denoted by $\Gamma(S)$, i.e., $\Gamma(S) = \{u \in V(G) - S \mid \exists e \in E(G),\ u \in e,\ e \cap S \ne \emptyset\}$.
The set of (hyper)edges incident to S is denoted by $\delta(S)$, i.e., $\delta(S) = \{e \in E(G) \mid
\emptyset \ne e \cap S \ne e\}$. Let $\delta_A(S) = A \cap \delta(S)$ for an $A \subseteq E(G)$ (in particular $\delta_{E(G)} = \delta$).
It is well known that for a fixed A, $|\delta_A|$ is submodular, i.e.,

$|\delta_A(X)| + |\delta_A(Y)| \ge |\delta_A(X \cap Y)| + |\delta_A(X \cup Y)|$ for any $X, Y \subseteq V(G)$. (1)

Since $|\delta_A|$ is symmetric (i.e., $|\delta_A(X)| = |\delta_A(V(G) - X)|$ for any $X \subseteq V(G)$),

$|\delta_A(X)| + |\delta_A(Y)| \ge |\delta_A(X - Y)| + |\delta_A(Y - X)|$ for any $X, Y \subseteq V(G)$. (2)

In this paper we treat the following problem. Given a hypergraph H with
nonnegative hyperedge costs, and a requirement function $r : 2^{V(H)} \to \mathbb{Z}_+$ (as
an oracle to evaluate r(X) for any given $X \subseteq V(H)$), find a minimum cost
hyperedge set $E^* \subseteq E(H)$ such that $|\delta_{E^*}(S)| \ge r(S)$ for all $S \subseteq V(H)$. The
problem can be converted into the next equivalent problem.

Definition 1 (Problem V). Given a bipartite graph G = (T, W, E), where T
and W are two disjoint vertex sets and E is a set of edges between T and W,
where vertices in T and W are called terminals and non-terminals, respectively.
Let $c : W \to \mathbb{R}_+$ be a cost function, and $r : 2^T \to \mathbb{Z}_+$ be a requirement function.
Find a minimum cost $W^* \subseteq W$ such that $|\Gamma(S) \cap \Gamma(T - S) \cap W^*| \ge r(S)$ for
all $S \subseteq T$. (Without loss of generality we assume that $r(\emptyset) = r(T) = 0$ and
$r_{\max} = \max\{r(S) \mid S \subseteq T\} \le |W|$, otherwise there is no feasible solution.)

The equivalence can be seen easily as follows. Let T = V(H). Replace each
hyperedge $e = \{v_1, \ldots, v_k\}$ with a new non-terminal vertex $w_e$ and k edges
$\{v_1, w_e\}, \ldots, \{v_k, w_e\}$. Assign $w_e$ the same cost as the hyperedge e. Notice that
$e \in \delta(S)$ in H if and only if $w_e \in \Gamma(S) \cap \Gamma(T - S)$ in G.
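
The conversion and the stated equivalence can be checked by brute force on a small instance. In the sketch below the function names and the representation (a non-terminal is the pair ("w", index of the hyperedge)) are assumptions made for this example.

```python
from itertools import combinations

def to_bipartite(hyperedges):
    """Replace hyperedge number e by a non-terminal ("w", e) adjacent to its endpoints."""
    return {("w", e): frozenset(ends) for e, ends in enumerate(hyperedges)}

def delta(hyperedges, S):
    """Indices of the hyperedges e with e & S nonempty and e not contained in S."""
    S = set(S)
    return {e for e, ends in enumerate(hyperedges)
            if set(ends) & S and not set(ends) <= S}

def gamma(adj, S):
    """Non-terminals adjacent to at least one terminal of S."""
    S = set(S)
    return {w for w, ends in adj.items() if ends & S}

# Hypothetical hypergraph on the terminals 1..4.
T = {1, 2, 3, 4}
H = [{1, 2, 3}, {3, 4}, {1, 2}]
adj = to_bipartite(H)
all_subsets = [set(c) for n in range(len(T) + 1) for c in combinations(sorted(T), n)]
equivalence_holds = all(
    {("w", e) for e in delta(H, S)} == gamma(adj, S) & gamma(adj, T - S)
    for S in all_subsets
)
```

The final check compares, for every subset S of the terminals, the hyperedges incident to S with the non-terminals adjacent to both S and T - S.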

In what follows, we will consider the problem V instead of the original form
of the problem. Define $\Delta(S) = \Gamma(S) \cap \Gamma(T - S) \subseteq W$ for $S \subseteq T$ (here and
in what follows, notations $\Gamma$ and $\Delta$ are defined in the input bipartite graph G
unless otherwise stated). Problem V can be written as the next integer program.

(IP) $\min \sum_{w \in W} c_w x_w$

s.t. $x(\Delta(S)) \ge r(S)$ for all $S \subseteq T$,

$x_w \in \{0, 1\}$ for all $w \in W$,

where $x(\Delta(S)) = \sum_{w \in \Delta(S)} x_w$. We assume that r satisfies two conditions.

Condition 1. r is weakly supermodular, i.e., r(T) = 0, and for any $X, Y \subseteq T$

$r(X) + r(Y) \le \max\{r(X \cap Y) + r(X \cup Y),\ r(X - Y) + r(Y - X)\}$. (3)

(The second condition is stated in Sect. 3.) Let $\Delta_A(S) = A \cap \Delta(S)$. Notice that
$|\Delta|$ defined in G equals $|\delta|$ in H. Thus $|\Delta_A|$ (for a fixed $A \subseteq W$) is also a
symmetric submodular function, from which for any $A \subseteq W$ and $X, Y \subseteq T$,

$|\Delta_A(X)| + |\Delta_A(Y)| \ge |\Delta_A(X \cap Y)| + |\Delta_A(X \cup Y)|$, (4)

$|\Delta_A(X)| + |\Delta_A(Y)| \ge |\Delta_A(X - Y)| + |\Delta_A(Y - X)|$. (5)
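
On a small terminal set, Condition 1 can be tested exhaustively. The sketch below is only an illustration: the helper names are assumptions, and the cut-type requirement r(S) = max of $r_{st}$ over the pairs separated by S is used as an SNDP-style example, relying on the standard fact that such functions are weakly supermodular.

```python
from itertools import combinations

def subsets(V):
    """All subsets of V as frozensets."""
    V = sorted(V)
    return [frozenset(c) for n in range(len(V) + 1) for c in combinations(V, n)]

def is_weakly_supermodular(r, V):
    """Brute-force check of r(V) = 0 and inequality (3) for all pairs X, Y."""
    V = frozenset(V)
    if r(V) != 0:
        return False
    for X in subsets(V):
        for Y in subsets(V):
            bound = max(r(X & Y) + r(X | Y), r(X - Y) + r(Y - X))
            if r(X) + r(Y) > bound:
                return False
    return True

def cut_requirement(pairs):
    """r(S) = largest r_st over the pairs (s, t) separated by S, or 0 if none."""
    def r(S):
        return max((k for (s, t), k in pairs.items() if (s in S) != (t in S)),
                   default=0)
    return r

# Hypothetical requirements r_st on the terminals 1..4.
r_cut = cut_requirement({(1, 2): 2, (3, 4): 1, (1, 4): 3})
# A function that is 1 on exactly one set fails (3), e.g. for X = {1,2}, Y = {1,3}.
r_bad = lambda S: 1 if S == frozenset({1, 2}) else 0
```

The check is exponential in the number of terminals and is meant only to make the inequality concrete.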

3 The Primal-Dual Approximation Algorithm for (IP) 

For an $S \subseteq T$ and $A \subseteq W$, the deficiency of S with respect to A is defined as
$r(S) - |\Delta_A(S)|$. Notice that A is feasible if and only if the maximum deficiency
over all $S \subseteq T$ is non-positive. Similarly to [4] and [11], our algorithm contains
$r_{\max}$ phases. Let $W_0 = \emptyset$ and $W_{i-1} \subseteq W$ be the non-terminal set picked after
phase $i - 1$. At the beginning of phase i, the maximum deficiency (with respect
to $W_{i-1}$) is $r_{\max} - i + 1$. We decrease the maximum deficiency by 1 in phase
i, by solving an augmenting problem (IP)$_i$. An $A_i \subseteq W - W_{i-1}$ that is feasible
to the augmenting problem is found by a primal-dual approach. We then set
$W_i = W_{i-1} \cup A_i$ and go to the next phase until $i = r_{\max}$.
The augmenting problem we want to solve in phase i is

\[ \text{(IP)}_i \quad \min \sum_{w \in W - W_{i-1}} c_w x_w \]
\[ \text{s.t.} \quad x(\Delta_{W - W_{i-1}}(S)) \ge h_i(S) \quad \text{for all } S \subseteq T, \]
\[ x_w \in \{0, 1\} \quad \text{for all } w \in W - W_{i-1}, \]

where $h_i(\cdot)$ is defined as

\[ h_i(S) = \begin{cases} 1 & \text{if } r(S) - |\Delta_{W_{i-1}}(S)| = r_{\max} - i + 1, \\ 0 & \text{otherwise } (r(S) - |\Delta_{W_{i-1}}(S)| \le r_{\max} - i) \end{cases} \tag{6} \]

(hence we have an oracle to evaluate $h_i$). Clearly the union of $W_{i-1}$ and any
feasible solution to (IP)$_i$ decreases the maximum deficiency by at least 1. (We
will see that $W_{i-1} \cup A_i$, where $A_i$ is the set found by the primal-dual approach,
decreases the maximum deficiency by exactly 1.) Thus at the end of phase $r_{\max}$,
a feasible solution to (IP) will be found.
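
The deficiency and feasibility notions above can be illustrated by a small brute-force sketch. It is not part of the algorithm: it enumerates all subsets of T, so it is only usable on toy instances. It assumes that $\Delta(S)$ consists of the non-terminals adjacent to both S and T - S (the definition itself lies outside this section), and the names `delta`, `max_deficiency`, `is_feasible` and the `nbrs` dictionary are illustrative choices.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = sorted(items)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def delta(S, T, nbrs, A):
    """Delta_A(S): members of A adjacent to both S and T - S (assumed definition)."""
    S = frozenset(S)
    rest = frozenset(T) - S
    return {w for w in A if nbrs[w] & S and nbrs[w] & rest}


def max_deficiency(T, nbrs, A, r):
    """Maximum of r(S) - |Delta_A(S)| over all subsets S of T."""
    return max(r(S) - len(delta(S, T, nbrs, A)) for S in subsets(T))


def is_feasible(T, nbrs, A, r):
    """A is feasible iff the maximum deficiency is non-positive."""
    return max_deficiency(T, nbrs, A, r) <= 0
```

For example, with terminals a, b, c, non-terminals u (adjacent to a, b) and v (adjacent to b, c), and r(S) = 1 for every proper non-empty S, the set {u} has maximum deficiency 1 while {u, v} is feasible.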

The notion of violated sets is needed by the primal-dual approach for (IP)$_i$.

Definition 2 (Violated Set). Let $A \subseteq W - W_{i-1}$ be a non-terminal set. A set
$S \subseteq T$ is said to be violated with respect to A if $h_i(S) = 1$ and $\Delta_A(S) = \emptyset$. It
is a minimal violated set if it is a violated set and minimal under set inclusion.
Let $\mathcal{V}(A) = \{ S \subseteq T \mid S \text{ is a minimal violated set of } A \}$.

It is clear that A is feasible to (IP)$_i$ if and only if $\mathcal{V}(A) = \emptyset$. Under the assumption
of Condition 1, the violated sets have a nice property as shown in the next lemma.

Lemma 1. For any $A \subseteq W - W_{i-1}$, if $X, Y \subseteq T$ are two violated sets of A,
then either $X \cap Y, X \cup Y$ or $X - Y, Y - X$ are violated sets of A, too.

Proof. By the definition of violated set and $h_i$, we see that $r(X) - |\Delta_{W_{i-1}}(X)| =
r(Y) - |\Delta_{W_{i-1}}(Y)| = r_{\max} - i + 1$ and $\Delta_A(X) = \Delta_A(Y) = \emptyset$. By (4) and (5),

\[ |\Delta_A(X \cap Y)| + |\Delta_A(X \cup Y)| \le |\Delta_A(X)| + |\Delta_A(Y)| = 0, \]

\[ |\Delta_A(X - Y)| + |\Delta_A(Y - X)| \le |\Delta_A(X)| + |\Delta_A(Y)| = 0. \]

Hence $\Delta_A(X \cap Y) = \Delta_A(X \cup Y) = \Delta_A(X - Y) = \Delta_A(Y - X) = \emptyset$. On the
other hand, r is weakly supermodular by Condition 1, and $|\Delta_{W_{i-1}}|$ is symmetric and
submodular. Thus $r - |\Delta_{W_{i-1}}|$ is also weakly supermodular. Hence

\[ (r(X \cap Y) - |\Delta_{W_{i-1}}(X \cap Y)|) + (r(X \cup Y) - |\Delta_{W_{i-1}}(X \cup Y)|) \]
\[ \ge (r(X) - |\Delta_{W_{i-1}}(X)|) + (r(Y) - |\Delta_{W_{i-1}}(Y)|) = 2(r_{\max} - i + 1) \]

or

\[ (r(X - Y) - |\Delta_{W_{i-1}}(X - Y)|) + (r(Y - X) - |\Delta_{W_{i-1}}(Y - X)|) \]
\[ \ge (r(X) - |\Delta_{W_{i-1}}(X)|) + (r(Y) - |\Delta_{W_{i-1}}(Y)|) = 2(r_{\max} - i + 1) \]

holds. Since $r_{\max} - i + 1$ is the maximum deficiency at the beginning of phase
i, it holds that $r(S) - |\Delta_{W_{i-1}}(S)| \le r_{\max} - i + 1$ for all $S \subseteq T$. Thus we have
$r(S) - |\Delta_{W_{i-1}}(S)| = r_{\max} - i + 1$ for $S \in \{X \cap Y, X \cup Y\}$ or $S \in \{X - Y, Y - X\}$.
Hence either $X \cap Y, X \cup Y$ or $X - Y, Y - X$ are violated sets of A. □

Two sets X, Y are said to intersect if $X \cap Y \ne \emptyset$, $X - Y \ne \emptyset$ and $Y - X \ne \emptyset$.
An immediate corollary of Lemma 1 is

Corollary 1. Let $A \subseteq W - W_{i-1}$. Let X and Y be a minimal violated set and a
violated set of A, respectively. Then X and Y do not intersect, i.e., either $X \subseteq Y$ or
$X \cap Y = \emptyset$. In particular, if Y is also a minimal violated set then $X \cap Y = \emptyset$. □

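We now state another condition which needs to be satisfied by r.

Condition 2. $\mathcal{V}(A)$ can be computed in polynomial time for any $A \subseteq W - W_{i-1}$.

Relax $x_w \in \{0, 1\}$ to $x_w \ge 0$. The dual of this relaxation of (IP)$_i$ is

\[ \text{(D)}_i \quad \max \sum_{S \subseteq T} h_i(S) y_S \]
\[ \text{s.t.} \quad \sum_{S \subseteq T : w \in \Delta(S)} y_S \le c_w \quad \text{for all } w \in W - W_{i-1}, \]
\[ y \ge 0. \]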
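We consider the following algorithm for (IP)$_i$ according to the primal-dual schema
outlined in [4] and [11]. We use $\bar{c}$, A, y and j to denote the reduced cost, primal
solution, dual variable and number of iterations, respectively.

 1  $\bar{c} \leftarrow c$, $A \leftarrow \emptyset$, $y \leftarrow 0$, $j \leftarrow 0$
 2  WHILE A is not feasible
 3      $j \leftarrow j + 1$
 4      $\mathcal{V}_j \leftarrow$ the minimal violated sets $\mathcal{V}(A)$
 5      IF exists $S \in \mathcal{V}_j$ such that $\Delta_{W - W_{i-1} - A}(S) = \emptyset$ THEN
            Halt. (IP) has no feasible solution.
 6      $w_j \leftarrow \arg\min \{ \bar{c}_w / |\{S \in \mathcal{V}_j \mid w \in \Delta(S)\}| \;:\; w \in W - W_{i-1} - A \}$
 7      $\varepsilon_j \leftarrow \bar{c}_{w_j} / |\{S \in \mathcal{V}_j \mid w_j \in \Delta(S)\}|$, $y_S \leftarrow y_S + \varepsilon_j$ for all $S \in \mathcal{V}_j$
 8      $\bar{c}_w \leftarrow \bar{c}_w - |\{S \in \mathcal{V}_j \mid w \in \Delta(S)\}| \, \varepsilon_j$ for all $w \in W - W_{i-1} - A$
 9      $A \leftarrow A \cup \{w_j\}$
10  FOR $l = j$ DOWN TO 1
11      IF $A - \{w_l\}$ is feasible THEN $A \leftarrow A - \{w_l\}$
12  Output A (as $A_i$).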
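
The steps above can be sketched in Python for very small instances. This is a hedged illustration, not the authors' implementation: the polynomial-time oracle of Condition 2 is replaced by brute-force enumeration of all proper non-empty subsets of T, $\Delta(S)$ is assumed to contain the non-terminals adjacent to both S and T - S, and all identifiers (`primal_dual_phase`, `cand`, `nbrs`, `crosses`) are hypothetical. Exact fractions are used so that the dual values are not disturbed by rounding.

```python
from fractions import Fraction
from itertools import combinations


def primal_dual_phase(T, cand, nbrs, cost, h):
    """One augmenting phase; `cand` plays the role of W - W_{i-1}, h(S) is 0 or 1."""
    T = frozenset(T)
    cand = sorted(cand)  # fixed order makes tie-breaking deterministic
    proper = [frozenset(c) for k in range(1, len(T))
              for c in combinations(sorted(T), k)]

    def crosses(w, S):
        # w is in Delta(S): it has a neighbour in S and one in T - S (assumption)
        return bool(nbrs[w] & S) and bool(nbrs[w] & (T - S))

    def min_violated(A):
        # brute-force stand-in for the oracle of Condition 2
        viol = [S for S in proper
                if h(S) == 1 and not any(crosses(w, S) for w in A)]
        return [S for S in viol if not any(V < S for V in viol)]

    reduced = {w: Fraction(cost[w]) for w in cand}   # step 1
    A, y = [], {}
    V = min_violated(A)
    while V:                                          # steps 2-4
        free = [w for w in cand if w not in A]
        hits = {w: sum(crosses(w, S) for S in V) for w in free}
        if any(not any(crosses(w, S) for w in free) for S in V):
            raise ValueError("(IP) has no feasible solution")   # step 5
        w_j = min((w for w in free if hits[w] > 0),
                  key=lambda w: reduced[w] / hits[w])           # step 6
        eps = reduced[w_j] / hits[w_j]                          # step 7
        for S in V:
            y[S] = y.get(S, Fraction(0)) + eps
        for w in free:                                          # step 8
            reduced[w] -= hits[w] * eps
        A.append(w_j)                                           # step 9
        V = min_violated(A)
    for w in reversed(list(A)):                                 # steps 10-11
        trial = [u for u in A if u != w]
        if not min_violated(trial):
            A = trial
    return set(A), y                                            # step 12
```

On the toy instance with terminals a, b, c and non-terminals u = {a, b}, v = {b, c} of cost 1 and x = {a, b, c} of cost 3, with h(S) = 1 for every proper non-empty S, the sketch selects {u, v} and a dual of total value 3/2.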



Clearly $A_i$ and y are feasible to (IP)$_i$ and (D)$_i$, respectively. Step 1 takes $O(|W|)$
time (only the positive $y_S$ are stored). There are at most $|W - W_{i-1}| \le |W|$
WHILE iterations since $|A|$ increases by 1 after each iteration. Let $\theta$ denote
the time complexity to compute $\mathcal{V}(A)$. Notice that steps 4 and 10-11 can be
done in $\theta$ time since A is feasible if and only if $\mathcal{V}(A) = \emptyset$. By Corollary 1,
we see $|\mathcal{V}(A)| \le |T|$. Thus step 6 can be done in $O(|T||W|)$ time and this
dominates the other steps. Hence the algorithm for (IP)$_i$ takes $O(|W|(\theta + |T||W|))$
time. Therefore the algorithm for (IP) can be done in $O(r_{\max}|W|(\theta + |T||W|))$
time. Since $r_{\max} \le |W|$, this is polynomial.



4 Proof of Performance Guarantee 

Lemma 2. Let $A_i$ and y be the output and the corresponding dual variable obtained
at the end of the primal-dual algorithm for (IP)$_i$, respectively. Then

\[ \sum_{w \in A_i} c_w \le d_{\max} \sum_{S \subseteq T} h_i(S) y_S. \] □

Before proving Lemma 2, we show that it implies the claimed guarantee.

Theorem 1. Let $opt_{IP}$ denote the optimal value of (IP). Let $W_{r_{\max}} = \bigcup_{i=1}^{r_{\max}} A_i$
be the output of our $r_{\max}$-phase algorithm for (IP). Then

\[ \sum_{w \in W_{r_{\max}}} c_w \le d_{\max} H(r_{\max}) \, opt_{IP}. \tag{7} \]

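Proof. Relax $x_w \in \{0, 1\}$ to $0 \le x_w \le 1$. The dual of this relaxation of (IP) is

\[ \text{(D)} \quad \max \sum_{S \subseteq T} r(S) y_S - \sum_{w \in W} z_w \]
\[ \text{s.t.} \quad \sum_{S \subseteq T : w \in \Delta(S)} y_S \le c_w + z_w \quad \text{for all } w \in W, \]
\[ y \ge 0, \; z \ge 0. \]

Let $opt_D$ denote the optimal value of (D). Notice that $opt_D \le opt_{IP}$ by weak
duality. Fix i. Let y be the dual variable of (D)$_i$ obtained in phase i. Let

\[ z_w = \begin{cases} \sum_{S \subseteq T : w \in \Delta(S)} y_S & \text{if } w \in W_{i-1}, \\ 0 & \text{otherwise } (w \in W - W_{i-1}). \end{cases} \]

It is easy to see that (y, z) is a feasible solution to (D). Thus we have

\[ opt_D \ge \sum_{S \subseteq T} r(S) y_S - \sum_{w \in W_{i-1}} \sum_{S : w \in \Delta(S)} y_S
   = \sum_{S \subseteq T} (r(S) - |\Delta_{W_{i-1}}(S)|) y_S
   = (r_{\max} - i + 1) \sum_{S \subseteq T} h_i(S) y_S. \]

The last equality holds because $y_S$ remains 0 for all S with $h_i(S) = 0$, and
$h_i(S) = 1$ if and only if $r(S) - |\Delta_{W_{i-1}}(S)| = r_{\max} - i + 1$. By Lemma 2 we have

\[ \sum_{w \in A_i} c_w \le d_{\max} \sum_{S \subseteq T} h_i(S) y_S \le \frac{d_{\max}}{r_{\max} - i + 1} \, opt_D, \]

and hence

\[ \sum_{w \in W_{r_{\max}}} c_w = \sum_{i=1}^{r_{\max}} \sum_{w \in A_i} c_w \le d_{\max} H(r_{\max}) \, opt_D \le d_{\max} H(r_{\max}) \, opt_{IP}. \] □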

Thus we only need to prove Lemma 2 to show the performance guarantee. 
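
The last summation in the proof of Theorem 1 can be spelled out as follows. This is a short worked step, assuming (as the notation suggests, since the definition lies outside this section) that $H(k) = \sum_{j=1}^{k} 1/j$ is the k-th harmonic number.

```latex
% Substituting j = r_max - i + 1 turns the per-phase bounds into a harmonic sum.
\sum_{i=1}^{r_{\max}} \frac{d_{\max}}{r_{\max} - i + 1}\, opt_D
  = d_{\max}\, opt_D \sum_{j=1}^{r_{\max}} \frac{1}{j}
  = d_{\max} H(r_{\max})\, opt_D .
% Example: for r_max = 3 the factor is 1/3 + 1/2 + 1 = 11/6.
```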

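Proof of Lemma 2. First suppose that $c_w > 0$ for all $w \in W$. Then $d_{\max}$ is the
maximum degree of non-terminals. Let L be the number of WHILE iterations.
Notice that $c_{w_l} = \sum_j |\{S \in \mathcal{V}_j \mid w_l \in \Delta(S)\}| \, \varepsilon_j$ for $l = 1, 2, \ldots, L$. Thus

\[ \sum_{w \in A_i} c_w = \sum_{w \in A_i} \sum_j |\{S \in \mathcal{V}_j \mid w \in \Delta(S)\}| \, \varepsilon_j = \sum_j \sum_{S \in \mathcal{V}_j} |\Delta_{A_i}(S)| \, \varepsilon_j. \]

On the other hand, since $y_S = \sum_{j : S \in \mathcal{V}_j} \varepsilon_j$, we have

\[ \sum_{S \subseteq T} h_i(S) y_S = \sum_{S \subseteq T} y_S = \sum_{S \subseteq T} \sum_{j : S \in \mathcal{V}_j} \varepsilon_j = \sum_j |\mathcal{V}_j| \, \varepsilon_j. \]

Thus we only need to show that for all $j \in \{1, \ldots, L\}$, it holds that

\[ \sum_{S \in \mathcal{V}_j} |\Delta_{A_i}(S)| \le d_{\max} |\mathcal{V}_j|. \tag{8} \]

Fix j and consider $A = \{w_1, \ldots, w_{j-1}\}$. By the reverse delete steps 10-11, we see that
$B = A \cup A_i$ is a minimal augmentation of A, i.e., $A \subseteq B \subseteq W - W_{i-1}$ and B
is feasible to (IP)$_i$ but the removal of any $w \in B - A$ will violate the feasibility.
Thus if we can show that for any infeasible non-terminal set $A \subseteq W - W_{i-1}$ and
any minimal augmentation B of A, it holds that

\[ \sum_{S \in \mathcal{V}(A)} |\Delta_B(S)| \le d_{\max} |\mathcal{V}(A)|, \tag{9} \]

then (8) holds (by letting $A = \{w_1, \ldots, w_{j-1}\}$ and $B = A \cup A_i$; notice that
$|\Delta_{A_i}(S)| \le |\Delta_B(S)|$), implying Lemma 2. In the following we show (9). For this, the notion
of witness sets is used. Let $U = \bigcup_{S \in \mathcal{V}(A)} \Delta_B(S) \subseteq B - A$.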

Definition 3 (Witness Set). $C_w \subseteq T$ is a witness set of $w \in U$ if it satisfies
(i) $h_i(C_w) = 1$, and (ii) $\Delta_B(C_w) = \{w\}$.

By (i) and (ii), we see that $C_w$ is a violated set of A (notice that $w \notin A$). For
any $w \in U$, there must exist a witness set of w since the removal of w violates
the feasibility of B (B is a minimal augmentation of A). Call $\{C_w \mid w \in U\}$ a
witness set family, in which only one $C_w$ is included for each $w \in U$.

Lemma 3. There exists a laminar (i.e., intersect-free) witness set family.

Proof. Given a witness set family we construct a laminar one.

Suppose that two witness sets $C_v$ and $C_w$ intersect. Since $C_v$ and $C_w$ are
violated sets of A we see that either $C_v \cap C_w$, $C_v \cup C_w$ or $C_v - C_w$, $C_w - C_v$ are
violated sets of A (Lemma 1).

Suppose that $C_v \cap C_w$ and $C_v \cup C_w$ are violated sets. By the definition of
violated set they must satisfy (i). That B is feasible implies $|\Delta_B(C_v \cap C_w)| \ge 1$ and
$|\Delta_B(C_v \cup C_w)| \ge 1$. However, by (4) we see that $|\Delta_B(C_v \cap C_w)| + |\Delta_B(C_v \cup C_w)| \le
|\Delta_B(C_v)| + |\Delta_B(C_w)| = 2$. Therefore $|\Delta_B(C_v \cap C_w)| = |\Delta_B(C_v \cup C_w)| = 1$ holds.
It is easy to see that $\{v, w\} \subseteq \Delta_B(C_v \cap C_w) \cup \Delta_B(C_v \cup C_w)$, which shows that
$\Delta_B(C_v \cap C_w) = \{a\}$ and $\Delta_B(C_v \cup C_w) = \{b\}$ hold for $\{a, b\} = \{v, w\}$. Thus we
can replace $C_v$ and $C_w$ by $C_v \cap C_w$ and $C_v \cup C_w$ in the witness family.

Similarly, if $C_v - C_w$ and $C_w - C_v$ are violated sets of A, then we can replace
$C_v$, $C_w$ by $C_v - C_w$ and $C_w - C_v$. In both cases this un-intersecting process
decreases the total number of pairs of intersecting sets in the witness family. Thus
after a finite number of such steps, a laminar witness set family is obtained. □
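
The "intersect" relation and the laminar (intersect-free) property used in Lemma 3 are easy to test directly. The following minimal sketch uses the definition of intersecting sets given before Corollary 1; the function names are illustrative.

```python
def intersect(X, Y):
    """X and Y intersect iff X & Y, X - Y and Y - X are all non-empty."""
    X, Y = frozenset(X), frozenset(Y)
    return bool(X & Y) and bool(X - Y) and bool(Y - X)


def is_laminar(family):
    """A family is laminar iff no two of its members intersect."""
    sets = [frozenset(S) for S in family]
    return not any(intersect(a, b)
                   for i, a in enumerate(sets) for b in sets[i + 1:])
```

Nested or disjoint pairs such as {1} and {1, 2} do not intersect, whereas {1, 2} and {2, 3} do.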

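Let $\mathcal{F}$ be the union of $\{T\}$ and the laminar witness set family. Construct
a rooted tree $\mathcal{T}$ by the set inclusion relationship as follows. $\mathcal{T}$ contains $|\mathcal{F}|$ nodes
$u_C$ for $C \in \mathcal{F}$, the root is $u_T$, and for each $C \in \mathcal{F}$, the parent of $u_C$ is the
node $u_{C'}$ for the minimum $C' \in \mathcal{F}$ such that $C \subsetneq C'$. For each $S \in \mathcal{V}(A)$, let
$u(S) = u_C$ for the minimum $C \in \mathcal{F}$ such that $S \subseteq C$. Let $\alpha(u_C) = |\{S \in \mathcal{V}(A)
\mid u(S) = u_C\}|$ for $u_C \in \mathcal{T}$. Let $\mathcal{T}_a = \{u_C \in \mathcal{T} \mid \alpha(u_C) \ge 1\}$. Clearly we have

\[ |\mathcal{V}(A)| = \sum_{u_C \in \mathcal{T}} \alpha(u_C) = \sum_{u_C \in \mathcal{T}_a} \alpha(u_C). \tag{10} \]

Let $d_{\mathcal{T}}(u_C)$ denote the degree of node $u_C$ in tree $\mathcal{T}$. For a node $u_{C_w}$ in tree $\mathcal{T}$,
$C_w$ is a witness set and is a violated set, implying that it must include some
$S \in \mathcal{V}(A)$. Thus if $u_{C_w}$ is a leaf, then $u_{C_w} = u(S)$. Hence all the degree 1 nodes
except for the root $u_T$ (if its degree is 1) must be in $\mathcal{T}_a$. This observation shows
that $\sum_{u_C \in \mathcal{T} - \mathcal{T}_a} d_{\mathcal{T}}(u_C) \ge 2(|\mathcal{T}| - |\mathcal{T}_a|) - 1$. Since $\mathcal{T}$ is a tree, we have that
$\sum_{u_C \in \mathcal{T}} d_{\mathcal{T}}(u_C) = 2(|\mathcal{T}| - 1)$. Thus

\[ \sum_{u_C \in \mathcal{T}_a} d_{\mathcal{T}}(u_C) = \sum_{u_C \in \mathcal{T}} d_{\mathcal{T}}(u_C) - \sum_{u_C \in \mathcal{T} - \mathcal{T}_a} d_{\mathcal{T}}(u_C) \]
\[ \le 2(|\mathcal{T}| - 1) - (2(|\mathcal{T}| - |\mathcal{T}_a|) - 1) = 2|\mathcal{T}_a| - 1. \tag{11} \]

We will show that for each $u_C \in \mathcal{T}_a$,

\[ \sum_{S \in \mathcal{V}(A) : u(S) = u_C} |\Delta_B(S)| \le \min\{d_{\max} - 1, \alpha(u_C)\} \, d_{\mathcal{T}}(u_C). \tag{12} \]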

Consider an $S \in \mathcal{V}(A)$ and a $w \in \Delta_B(S)$. Let $C_w \in \mathcal{F}$ be the witness set of w.
Since $C_w$ is a violated set, either $S \subseteq C_w$ or $S \cap C_w = \emptyset$ holds by Corollary 1.

Case 1: $S \subseteq C_w$. Since $w \in \Delta_B(S)$, there exists an $s \in \Gamma(w) \cap S$. By (ii),
there exists a $t \in \Gamma(w) \cap (T - C_w)$. Thus $u(S) = u_{C_w}$. Let $u_C$ be the parent
of $u_{C_w}$ in $\mathcal{T}$ (it exists since $C_w \ne T$). Then we see that $\Gamma(w) \subseteq C$ (otherwise
$w \in \Delta_B(C)$, which implies that $C \ne T$ and C must be a witness set such that
$\Delta_B(C) = \{w\}$, contradicting that $\Delta_B(C_w) = \{w\}$ for $C_w \ne C$). We use a directed
edge $(u_{C_w}, u_C)$ to represent this case. The directed edge $(u_{C_w}, u_C)$ may not be
unique since there may exist some other $S' \in \mathcal{V}(A)$ such that $w \in \Delta_B(S')$
and $u(S') = u_{C_w}$. In such cases multiple directed edges $(u_{C_w}, u_C)$ are allowed,
but for each such set S' ($w \in \Delta_B(S')$ and $u(S') = u_{C_w}$) only one edge
$(u_{C_w}, u_C)$ is used. Notice that such sets are disjoint (Corollary 1). It is then
easy to see that the total number of such directed edges $(u_{C_w}, u_C)$ is at most
$\min\{|\Gamma(w)| - 1, \alpha(u_{C_w})\} \le \min\{d_{\max} - 1, \alpha(u_{C_w})\}$ (notice that $t \in \Gamma(w) - C_w$).

Case 2: $S \cap C_w = \emptyset$. There must exist an $s \in \Gamma(w) \cap S$ and a $t \in \Gamma(w) \cap C_w$. Let
$u_C$ be the parent of $u_{C_w}$. We see that $\Gamma(w) \subseteq C$, hence $S \subseteq C$ and $u(S) = u_C$. We
use a directed edge $(u_C, u_{C_w})$ to represent this case. Similarly as in the previous
case, the total number of these $(u_C, u_{C_w})$ edges is at most $\min\{d_{\max} - 1, \alpha(u_C)\}$.

For each $u_C \in \mathcal{T}_a$, the two cases may happen simultaneously. But we have
seen that for one edge $\{u_C, u_{C'}\}$ in $\mathcal{T}$, there are at most $\min\{d_{\max} - 1, \alpha(u_C)\}$
directed edges $(u_C, u_{C'})$ that are produced in Case 1 or 2. Thus the total number
of the directed edges $(u_C, \cdot)$ is at most $\min\{d_{\max} - 1, \alpha(u_C)\} \, d_{\mathcal{T}}(u_C)$. On the
other hand, the way that the directed edges are produced ensures that the total
number of the directed edges $(u_C, \cdot)$ (over all $S \in \mathcal{V}(A)$ and $w \in \Delta_B(S)$) equals
$\sum_{S \in \mathcal{V}(A) : u(S) = u_C} |\Delta_B(S)|$. Hence (12) has been shown. Thus

\[ \sum_{S \in \mathcal{V}(A)} |\Delta_B(S)| = \sum_{u_C \in \mathcal{T}_a} \sum_{S \in \mathcal{V}(A) : u(S) = u_C} |\Delta_B(S)|
   \le \sum_{u_C \in \mathcal{T}_a} \min\{d_{\max} - 1, \alpha(u_C)\} \, d_{\mathcal{T}}(u_C). \]

To show (9), it suffices to show by (10) that

\[ \sum_{u_C \in \mathcal{T}_a} \min\{d_{\max} - 1, \alpha(u_C)\} \, d_{\mathcal{T}}(u_C) \le d_{\max} \sum_{u_C \in \mathcal{T}_a} \alpha(u_C). \tag{13} \]
Let $X = \{u_C \in \mathcal{T}_a \mid \alpha(u_C) \ge d_{\max} - 1\}$, $Y = \{u_C \in \mathcal{T}_a \mid \alpha(u_C) = 1\}$ and
$Z = \mathcal{T}_a - X - Y$. The left hand side of (13) is at most

\[ (d_{\max} - 1) \sum_{u_C \in X} d_{\mathcal{T}}(u_C) + \sum_{u_C \in Y} d_{\mathcal{T}}(u_C) + (d_{\max} - 2) \sum_{u_C \in Z} d_{\mathcal{T}}(u_C). \]

The right hand side of (13) is at least $d_{\max}((d_{\max} - 1)|X| + |Y| + 2|Z|)$. Then
by (11) and $|\mathcal{T}_a| = |X| + |Y| + |Z|$, it is not difficult to verify (13).

Thus we have proved Lemma 2 if $c_w > 0$ for all $w \in W$. It is not difficult to
see that it is also true in the general case. To see this, notice that we only need to
show (8) for j with $\varepsilon_j > 0$, which implies that $c_w > 0$ for all $w \in W - W_{i-1} - A$
that belong to $\Delta(S)$ for some $S \in \mathcal{V}_j$. Thus $|\Gamma(w)| \le d_{\max}$ for all such w,
and the proof goes through. □

5 Survivable Network Design Problem in Hypergraph 

The survivable network design problem in hypergraphs (SNDPHG) is equivalent to the
following problem defined in a bipartite graph G = (T, W, E).
Given $c : W \to R_+$ and $r_{st} \in Z_+$ for each pair of distinct terminals $s, t \in T$, find
a minimum cost $W^* \subseteq W$ such that, for each pair of s and t, $G[T \cup W^*]$ has
at least $r_{st}$ paths which are W-disjoint (i.e., no $w \in W$ belongs to two or more
paths). We show that it is equivalent to problem V (Sect. 2) with r defined as
$r(S) = \max\{r_{st} \mid s \in S, t \in T - S\}$ for all $S \subseteq T$ ($r(\emptyset) = r(T) = 0$).

A useful idea when considering W-disjoint paths in G is a transformation
from G to a digraph $\mathcal{D}$ in the following way.

Definition 4 (Transformation $\mathcal{D}$). Replace each undirected edge $e = \{v, w\}$
by two directed edges (v, w) and (w, v). Then for each non-terminal w make a
copy named $w^c$, and change the tails of all directed edges (w, v) from w to $w^c$,
then add a new directed edge $(w, w^c)$. Let the capacity of the directed edge $(w, w^c)$
be 1 for all non-terminals w, and that of all other directed edges be $+\infty$.
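
The transformation can be sketched in a few lines of Python, together with a plain Edmonds-Karp max-flow routine to count W-disjoint paths on toy instances. This is an illustrative sketch only: the graph is given by a hypothetical dictionary `nbrs` mapping each non-terminal to its set of adjacent terminals, the copy of w is represented by the tuple `(w, "copy")`, and the max-flow routine is a textbook one chosen for brevity, not the algorithm of [3].

```python
import math
from collections import deque


def to_digraph(nbrs):
    """Return {(tail, head): capacity} for the node-split digraph."""
    cap = {}
    for w, adj in nbrs.items():
        w_copy = (w, "copy")            # hypothetical name for the copy of w
        cap[(w, w_copy)] = 1            # unit capacity: w serves at most one path
        for v in adj:
            cap[(v, w)] = math.inf      # terminal -> non-terminal
            cap[(w_copy, v)] = math.inf  # edges leaving w now leave its copy
    return cap


def max_flow(cap, s, t):
    """Edmonds-Karp on a capacity dictionary; s and t are distinct terminals."""
    res = dict(cap)
    for (u, v) in cap:
        res.setdefault((v, u), 0)       # residual (reverse) edges
    adj = {}
    for (u, v) in res:
        adj.setdefault(u, []).append(v)
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj.get(u, []):
                if v not in parent and res[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        bottleneck, v = math.inf, t
        while parent[v] is not None:    # find the bottleneck on the path
            bottleneck = min(bottleneck, res[(parent[v], v)])
            v = parent[v]
        v = t
        while parent[v] is not None:    # augment along the path
            res[(parent[v], v)] -= bottleneck
            res[(v, parent[v])] += bottleneck
            v = parent[v]
        flow += bottleneck
```

With non-terminals u = {a, b}, v = {b, c} and x = {a, b, c}, the maximum a,c-flow is 2, matching the two W-disjoint paths a-x-c and a-u-b-v-c.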

Notice that for any pair of terminals s and t, any k W-disjoint s,t-paths in G
are transformed to an integer s,t-flow of value k in the transformed digraph, and vice versa. Thus
a non-terminal set $W' \subseteq W$ is feasible if and only if the maximum s,t-flow in
$G[T \cup W']$ has value at least $r_{st}$ for each pair of terminals s and t. By the
well-known max-flow min-cut theorem [1], this is equivalent to the condition that in $G[T \cup W']$ any s,t-cut
has capacity at least $r_{st}$ for each pair of terminals s and t. It is not difficult to
see that this is equivalent to $|\Delta_{W'}(S)| \ge r(S)$ for all $S \subseteq T$. Thus the SNDPHG
is equivalent to problem V with r defined as $r(S) = \max\{r_{st} \mid s \in S, t \in T - S\}$.

It is easy to see that r is a weakly supermodular function, satisfying Condition
1. We show that the minimal violated sets can be computed in polynomial time
(Condition 2), which means that our algorithm in Sect. 3 works for the SNDPHG.
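
The claim that $r(S) = \max\{r_{st}\}$ satisfies Condition 1 can be checked by brute force on a small example. The sketch below tests inequality (3) for every pair of subsets; the helper names `make_r` and `weakly_supermodular` and the encoding of requirements as a dictionary keyed by unordered pairs are illustrative assumptions.

```python
from itertools import combinations


def make_r(T, req):
    """r(S) = max r_st over s in S, t in T - S (0 if there is no such pair)."""
    T = frozenset(T)

    def r(S):
        S = frozenset(S)
        return max((req.get(frozenset((s, t)), 0) for s in S for t in T - S),
                   default=0)
    return r


def weakly_supermodular(T, r):
    """Check r(T) = 0 and inequality (3) for all X, Y (exponential; toy sizes only)."""
    T = frozenset(T)
    subs = [frozenset(c) for k in range(len(T) + 1)
            for c in combinations(sorted(T), k)]
    if r(T) != 0:
        return False
    return all(r(X) + r(Y) <= max(r(X & Y) + r(X | Y), r(X - Y) + r(Y - X))
               for X in subs for Y in subs)
```

A function that is 1 only on the single set {a, b} of T = {a, b, c} fails the check, which shows that the test is not vacuous.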



Lemma 4. Denote $W_{i-1} \cup A$ by $\bar{A}$. If S is a minimal violated set of A, then there
exist vertices $s \in S$ and $t \in T - S$ such that in the digraph $G[T \cup \bar{A}]$, $S = C_{st} \cap T$
for any minimum s,t-cut $C_{st}$ that is minimal under set inclusion.

Proof. By the definition of r(S), there exist two terminals $s \in S$ and $t \in T - S$ such
that $r_{st} = r(S)$. Let $C = S \cup \Gamma_{\bar{A}}(S) \cup \{w^c \mid w \in \Gamma_{\bar{A}}(S) - \Delta_{\bar{A}}(S)\}$. It is clear
that $S = C \cap T$. We show that $C = C_{st}$, which implies the claimed statement.
We first show that C is a minimum s,t-cut.

The capacity of C is exactly $|\Delta_{\bar{A}}(S)|$. Let $S' = C_{st} \cap T$. It is easy to see
that the capacity of $C_{st}$ is at least $|\Delta_{\bar{A}}(S')|$. We show that $|\Delta_{\bar{A}}(S)| \le |\Delta_{\bar{A}}(S')|$,
which implies that C is a minimum s,t-cut. Notice that S is a violated set of A,
implying $h_i(S) = 1$ and $\Delta_A(S) = \emptyset$. Thus $r(S) - |\Delta_{W_{i-1}}(S)| = r_{\max} - i + 1$ and
$\Delta_A(S) = \emptyset$. Hence

\[ |\Delta_{\bar{A}}(S)| = |\Delta_{W_{i-1}}(S)| + |\Delta_A(S)| = r_{st} - r_{\max} + i - 1. \tag{14} \]

On the other hand, if S' is also a violated set of A, then similarly

\[ |\Delta_{\bar{A}}(S')| = r(S') - r_{\max} + i - 1 \ge r_{st} - r_{\max} + i - 1 = |\Delta_{\bar{A}}(S)|. \tag{15} \]

(Notice that $s \in S'$ and $t \in T - S'$.) Otherwise S' is not a violated set of A, and
hence either $h_i(S') = 1$, $\Delta_A(S') \ne \emptyset$ or $h_i(S') = 0$, which implies

\[ |\Delta_{\bar{A}}(S')| \ge r(S') - r_{\max} + i \ge r_{st} - r_{\max} + i > |\Delta_{\bar{A}}(S)|. \tag{16} \]

Thus C is a minimum s,t-cut. Since $C_{st}$ is also a minimum s,t-cut, we have
$|\Delta_{\bar{A}}(S)| = |\Delta_{\bar{A}}(S')|$. The proof above shows that $S' = C_{st} \cap T$ must be a
violated set of A. Thus by Corollary 1 we have $S \subseteq S' = C_{st} \cap T$. It is then easy
to see that $C \subseteq C_{st}$. Thus $C = C_{st}$ by the assumption that $C_{st}$ is minimal. □

By Lemma 4, we can identify the minimal violated sets by computing a minimal
minimum s,t-cut in $G[T \cup \bar{A}]$ for all pairs of terminals s and t and checking
if they are violated and minimal among these cuts. It is well known that for
each pair of s and t, there is only one minimal minimum s,t-cut, which can be
found by one max-flow computation in $O(n^3)$ time for a digraph with n vertices
([3]). Thus the total running time to find the minimal violated sets is dominated
by $O(|T|^2)$ max-flow computations. Thus our algorithm for the SNDPHG can be
implemented to run in $O(r_{\max}|W||T|^2(|T| + 2|W|)^3)$ time. We summarize this as
the next theorem.

Theorem 2. Let $d_{\max}$ be the maximum degree of hyperedges with positive cost and
$r_{\max}$ be the maximum requirement. Then the SNDPHG can be approximated
within a factor of $d_{\max} H(r_{\max})$ in $O(r_{\max} m n^2 (n + 2m)^3)$ time, where m and n are the
numbers of hyperedges and vertices, respectively. □

6 Conclusion 

In this paper, we have shown that the SNDPHG can be approximated within a factor of
$d_{\max} H(r_{\max})$ in polynomial time. We note that the performance guarantee $d_{\max}$
of the primal-dual algorithm for (IP)$_i$ (Lemma 2) is tight (a tight example will
be given in the full version of this paper). Notice that in [4] Goemans et al. have
shown that for the SNDP in graphs the performance guarantee $2H(r_{\max})$ is tight
up to a constant factor. It is thus interesting to know whether an algorithm
with an improved (e.g., constant) performance guarantee can be developed via an
iterative rounding process as used in [7] for the SNDP.
Abstract. Despite the well-known state-explosion problem, certain
simple but important data-flow analysis problems known as gen/kill
problems can be solved efficiently and completely for parallel programs
with a shared state [7,6,2,3,13]. This paper shows that, in all probability,
these surprising results cannot be generalized to significantly larger
classes of data-flow analysis problems.

More specifically, we study the complexity of detecting copy constants 
in parallel programs, a problem that may be seen as representing the 
next level of difficulty of data-flow problems beyond gen/kill problems. 
We show that already the intraprocedural problem for loop-free parallel 
programs is co-NP-complete and that the interprocedural problem is 
even PSPACE-hard. 



1 Introduction 

A well-known obstacle for the automatic analysis of parallel programs is the 
so-called state-explosion problem: the number of (control) states of a parallel
program grows exponentially with the number of parallel components. It comes 
as a surprise that certain basic but important data-flow analysis problems can 
nevertheless be solved completely and efficiently for programs with a fork/join 
kind of parallelism. 

Knoop, Steffen, and Vollmer [7] show that bitvector analyses, which comprise,
e.g., live/dead variable analysis, available expression analysis, and reaching
definition analysis [8], can efficiently be performed on such programs. Knoop shows
in [6] that a simple variant of constant detection, that of so-called strong
constants, is tractable as well. These articles restrict attention to the intraprocedural
problem, in which each procedure body is analyzed separately with worst-case
assumptions on called procedures. Seidl and Steffen [13] generalize these results
to the interprocedural case in which the interplay between procedures is taken
into account and to a slightly more extensive class of data-flow problems called
gen/kill problems¹. All these papers extend the fixpoint computation technique
common in data-flow analysis to parallel programs. Another line of research
applies automata-theoretic techniques that were originally developed for the
verification of PA-processes, a certain class of infinite-state processes combining
sequentiality and parallelism. Specifically, Esparza, Knoop, and Podelski [2,3]
demonstrate how live variables analysis can be done and indicate that other
bitvector analyses can be approached in a similar fashion.

¹ Gen/kill problems are characterized by the fact that all transfer functions are of the
form $\lambda x.(x \wedge a) \vee b$, where a, b are constants from the underlying lattice of data-flow
facts.

Can these results be generalized further to considerably richer classes of data- 
flow problems? The current paper shows that this is very unlikely. We investigate 
the complexity of detection of copy constants, a problem that may be seen as 
a canonic representative of the next level of difficulty of data-flow problems be- 
yond gen/kill problems. In the sequential setting the problem gives rise to a 
distributive data-flow framework on a lattice with small chain height and can 
thus - by the classic result of Kildall [5,8] - completely and efficiently be solved 
by a fixpoint computation. We show in this paper that copy constant detection 
is co-NP-complete already for loop-free parallel programs without procedures 
and becomes even PSPACE-hard if one allows loops and non-recursive procedures.
This renders the possibility of complete and efficient data-flow analysis
algorithms for parallel programs for more extensive classes of analyses unlikely,
as it is generally believed that the inclusions P $\subseteq$ co-NP $\subseteq$ PSPACE are proper.

Our theorems should be contrasted with complexity and undecidability results
of Taylor [14] and Ramalingam [11] who consider synchronization-dependent
data-flow analyses of parallel programs, i.e. analyses that are precise with respect
to the synchronization structure of programs. Taylor and Ramalingam largely 
exploit the strength of rendezvous style synchronization, while we exploit only 
interference and no kind of synchronization. Our results thus point to a more 
fundamental limitation in data-flow analysis of parallel programs. 

This paper is organized as follows: In Sect. 2 we give some background in- 
formation on data-flow analysis in general and the constant detection problem 
in particular. In Sect. 3 we introduce loop-free parallel programs. This sets the 
stage for the co-NP-completeness result for the loop-free intraprocedural par- 
allel case which is proved afterwards. We proceed by enriching the considered 
programming language with loops and procedures in Sect. 4. We then show that 
the interprocedural parallel problem is PSPACE-hard even if we allow only non- 
recursive procedures. In the Conclusions, Sect. 5, we indicate that the presented 
results apply also to some other data-flow analysis problems, detection of may- 
constants and detection of faint code, and discuss directions for future research. 
Throughout this paper we assume that the reader is familiar with the basic 
notions and methods of the theory of computational complexity (see, e.g., [10]). 

2 Copy Constants 

The goal of data-flow analysis is to gather information about certain aspects 
of the behavior of programs by a static analysis. Such information is valuable 
e.g. in optimizing compilers and in CASE tools. However, most questions about 
programs are undecidable. This holds in particular for the question whether a 
condition in a program may be satisfied or not. In order to come to grips with
undecidability, it is common in data-flow analysis to abstract from the conditions
in the programs and to interpret conditional branching as non-deterministic
branching, a point of view adopted in this paper. Of course, an analysis based
on this abstraction considers more program executions than actually possible at
run-time. One is careful to take this into account when exploiting the results of
data-flow analysis.

An expression e is a constant at a given point p in a program, if e evaluates
to one and the same value whenever control reaches p, i.e. after every run
from the start of the program to p. If an expression is detected to be a constant
at compile time it can be replaced by its value, a standard transformation
in optimizing compilers known as constant propagation or constant folding [8].
Constant folding is profitable as it decreases both code size and execution time.
Constancy information is sometimes also useful for eliminating branches of
conditionals that cannot be taken at run-time and for improving the precision of
other data-flow analyses.

Reif and Lewis [12] show by a reduction of Hilbert’s tenth problem that the 
general constant detection problem in sequential programs is undecidable, even if 
branching is interpreted non-deterministically. However, if one restricts the kind 
of expressions allowed on the right hand side of assignment statements appro- 
priately, the problem becomes decidable. (In practice assignments of a different 
form are treated by approximating or worst-case assumptions.) A problem that
is particularly simple for sequential programs is that of so-called copy constants. In
this problem assignment statements take only the simple forms x := c (constant 
assignment) and x := y (copying assignment), where c is a constant and x, y are 
variables. In the remainder of this paper we study the complexity of detecting 
copy constants in parallel programs. 



3 Loop-Free Parallel Programs 

Let us, first of all, set the stage for the parallel loop-free intraprocedural copy 
constant detection problem. We consider loop-free parallel programs given by the 
following abstract grammar, 

$\pi ::= x := e \mid \mathbf{write}(x) \mid \mathbf{skip} \mid \pi_1 ; \pi_2 \mid \pi_1 \parallel \pi_2 \mid \pi_1 \,\Box\, \pi_2$
$e ::= c \mid x$ ,

where x ranges over some set of variables and c over some set of basic constants.
As usual we use parentheses to disambiguate programs. Note that this language
has only constant and copying assignments. The specific nature of basic con- 
stants and the value domain in which they are interpreted is immaterial; we 
only need that 0 and 1 are two constants representing different values, which - 
by abuse of notation - are also denoted by 0 and 1. The atomic statements of 
the language are assignment statements x := e that assign the current value of e 
to variable x, ‘do-nothing’-statements skip, and write-statements. The purpose 
of write-statements in this paper is to mark prominently the program points
at which we are interested in constancy of a certain variable. The operator ;
represents sequential composition, $\parallel$ represents parallel composition, and $\Box$
non-deterministic branching.

Parallelism is understood in an interleaving fashion; assignments and
write-statements are assumed to be atomic. A run of a program is a maximal sequence
of atomic statements that may be executed in this order in an execution of the
program. The program $(x := 1 ; x := y) \parallel y := x$, for example, has the three runs
$\langle x := 1, x := y, y := x \rangle$, $\langle x := 1, y := x, x := y \rangle$, and $\langle y := x, x := 1, x := y \rangle$.

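In order to allow a formal definition of runs, we need some notation. We
denote the empty sequence by $\varepsilon$ and the concatenation operator by an infix
dot. The concatenation operator is lifted to sets of sequences in the obvious
way: if S, T are two sets of sequences then $S \cdot T = \{ s \cdot t \mid s \in S, t \in T \}$. Let
$r = \langle e_1, \ldots, e_n \rangle$ be a sequence and $I = \{i_1, \ldots, i_k\}$ a subset of positions in r
such that $i_1 < i_2 < \cdots < i_k$. Then $r|I$ is the sequence $\langle e_{i_1}, \ldots, e_{i_k} \rangle$. We write $|r|$
for the length of r, viz. n. The interleaving of S and T is

\[ S \parallel T \stackrel{\mathrm{def}}{=} \{ r \mid \exists I_S, I_T : I_S \cup I_T = \{1, \ldots, |r|\},\; I_S \cap I_T = \emptyset,\; r|I_S \in S,\; r|I_T \in T \} . \]

The set of runs of a program can now inductively be defined:

\[ \mathrm{Runs}(x := e) = \{ \langle x := e \rangle \} \qquad \mathrm{Runs}(\pi_1 ; \pi_2) = \mathrm{Runs}(\pi_1) \cdot \mathrm{Runs}(\pi_2) \]
\[ \mathrm{Runs}(\mathbf{write}(x)) = \{ \langle \mathbf{write}(x) \rangle \} \qquad \mathrm{Runs}(\pi_1 \parallel \pi_2) = \mathrm{Runs}(\pi_1) \parallel \mathrm{Runs}(\pi_2) \]
\[ \mathrm{Runs}(\mathbf{skip}) = \{ \varepsilon \} \qquad \mathrm{Runs}(\pi_1 \,\Box\, \pi_2) = \mathrm{Runs}(\pi_1) \cup \mathrm{Runs}(\pi_2) . \]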
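
The inductive definition of Runs translates almost literally into code. The following sketch assumes a hypothetical encoding of programs as nested tuples (`("asg", x, e)`, `("write", x)`, `("skip",)`, `("seq", p, q)`, `("par", p, q)`, `("choice", p, q)`) and represents a run as a tuple of atomic statements; the number of runs grows exponentially, so it is meant for small examples only.

```python
def interleavings(r, s):
    """All interleavings of two runs r and s (tuples of atomic statements)."""
    if not r:
        return {s}
    if not s:
        return {r}
    return ({(r[0],) + t for t in interleavings(r[1:], s)} |
            {(s[0],) + t for t in interleavings(r, s[1:])})


def runs(p):
    """Set of runs of a loop-free parallel program p."""
    kind = p[0]
    if kind in ("asg", "write"):
        return {(p,)}                      # a single atomic statement
    if kind == "skip":
        return {()}                        # only the empty run
    left, right = runs(p[1]), runs(p[2])
    if kind == "seq":
        return {a + b for a in left for b in right}
    if kind == "par":
        return {t for a in left for b in right for t in interleavings(a, b)}
    if kind == "choice":
        return left | right
    raise ValueError(f"unknown construct: {kind}")
```

Applied to the example program $(x := 1 ; x := y) \parallel y := x$, encoded as `("par", ("seq", ("asg", "x", "1"), ("asg", "x", "y")), ("asg", "y", "x"))`, the function returns exactly the three runs listed earlier.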

3.1 NP-Completeness of the Loop-Free Intraprocedural Problem

The remainder of this section is devoted to the proof of the following theorem, 
which shows that complete detection of copy constants is intractable in parallel 
programs, unless P = NP. 

Theorem 1. The problem of detecting copy constants in loop-free parallel pro- 
grams is co-NP-complete. 

Certainly, the problem lies in co-NP: if a variable x is not constant at a 
certain point in the program we can guess two runs that witness two different 
values. As the program has no loops, the length of these runs (and thus the time 
needed to guess them) is at most linear in the size of the program. 

For showing co-NP-hardness we reduce SAT, the most widely known NP-complete
problem [1,10], to the negation of a copy constant detection problem.
An instance of SAT is a conjunction $c_1 \wedge \ldots \wedge c_k$ of clauses $c_1, \ldots, c_k$. Each
clause is a disjunction of literals; a literal l is either a variable x or a negated
variable $\neg x$, where x ranges over some set of variables X. It is straightforward
to define when a truth assignment $T : X \to \mathbb{B}$, where $\mathbb{B} = \{tt, ff\}$ is the set of
truth values, satisfies $c_1 \wedge \ldots \wedge c_k$. The SAT problem asks us to decide for each
instance $c_1 \wedge \ldots \wedge c_k$ whether there is a satisfying truth assignment or not.
Now suppose we are given a SAT instance $c_1 \wedge \ldots \wedge c_k$ with k clauses over n
variables $X = \{x_1, \ldots, x_n\}$. We write $\bar{X} = \{\neg x_1, \ldots, \neg x_n\}$ for the set of negated
variables. From this SAT instance we construct a loop-free parallel program. In
the program we use k + 1 variables $z_0, z_1, \ldots, z_k$. Intuitively, $z_i$ is, for $1 \le i \le k$,
related to clause $c_i$; $z_0$ is an extra variable.

For each literal $l \in X \cup \bar{X}$ we define a program $\pi_l$. Program $\pi_l$ consists of a
sequential composition of assignments of the form $z_i := z_{i-1}$ in increasing order
of i. The assignment $z_i := z_{i-1}$ is in $\pi_l$ if and only if the literal l makes clause i
true. Formally, $\pi_l = \pi_l^k$, where

\[ \pi_l^0 \stackrel{\mathrm{def}}{=} \mathbf{skip} \quad \text{and} \quad
   \pi_l^i \stackrel{\mathrm{def}}{=} \begin{cases} \pi_l^{i-1} ; z_i := z_{i-1} , & \text{if clause } c_i \text{ contains } l \\ \pi_l^{i-1} , & \text{if clause } c_i \text{ does not contain } l \end{cases} \]

for $i = 1, \ldots, k$. Now, consider the following program $\pi$:

\[ z_0 := 1 ; z_1 := 0 ; \ldots ; z_k := 0 ; \]
\[ [(\pi_{x_1} \,\Box\, \pi_{\neg x_1}) \parallel \cdots \parallel (\pi_{x_n} \,\Box\, \pi_{\neg x_n})] ; \]
\[ (z_k := 0 \,\Box\, \mathbf{skip}) ; \mathbf{write}(z_k) . \]

Clearly, $\pi$ can be constructed from the given SAT instance $c_1 \wedge \ldots \wedge c_k$ in
polynomial time or logarithmic space. We show that the variable $z_k$ at the
write-statement is not a constant if and only if $c_1 \wedge \ldots \wedge c_k$ is satisfiable. This proves
the co-NP-hardness claim.

First observe that 0 and 1 are the only values $z_k$ can hold at the
write-statement because all variables are initialized by 0 or 1 and the other assignments
only copy these values. Clearly, due to the non-deterministic choice just before
the write-statement, $z_k$ may hold 0 finally. Thus, $z_k$ is a constant at the
write-statement iff it cannot hold 1 there. Hence, our goal reduces to proving that $z_k$
can hold 1 finally if and only if $c_1 \wedge \ldots \wedge c_k$ is satisfiable.

“If”: Suppose $T : X \to \mathbb{B}$ is a satisfying truth assignment for $c_1 \wedge \ldots \wedge c_k$.
Consider the following run of $\pi$: in each parallel component $\pi_{x_i} \,\Box\, \pi_{\neg x_i}$ choose
the left branch $\pi_{x_i}$ if $T(x_i) = tt$ and the right branch $\pi_{\neg x_i}$ otherwise. As T is
a satisfying truth assignment, there will be, for any $i \in \{1, \ldots, k\}$, at least one
assignment $z_i := z_{i-1}$ in one of the chosen branches. We interleave the branches
now in such a way that the assignment(s) to $z_1$ are executed first, followed by
the assignment(s) to $z_2$ etc. This results in a run that copies the initialization
value 1 of $z_0$ to $z_k$.

“Only if”: Suppose $z_k$ may hold 1 at the write-statement. As the initialization
$z_0 := 1$ is the only statement in which the constant 1 occurs, there must be a run
in which this value is copied from $z_0$ to $z_k$ via a sequence of copy instructions. As
all copying assignments in $\pi$ have the form $z_i := z_{i-1}$, the value must be copied
from $z_0$ to $z_1$, from $z_1$ to $z_2$ etc. Consequently, the non-deterministic choices in
the parallel components can be resolved in such a way that the chosen branches
contain all the assignments $z_i := z_{i-1}$ for $i = 1, \ldots, k$. From such a choice a
satisfying truth assignment can easily be constructed.
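
The reduction can be tried out on small formulas. The sketch below is an illustration under stated assumptions, not part of the proof: clauses are encoded as lists of non-zero integers (i for $x_i$, -i for $\neg x_i$, a convention chosen here), the function names are hypothetical, and the search explores all interleavings explicitly, so it is exponential. It decides whether $z_k$ may hold 1 at the write-statement of the constructed program and can be compared with a brute-force satisfiability check.

```python
from itertools import product


def literal_program(lit, clauses):
    """pi_l as the list of indices i whose assignment z_i := z_{i-1} it contains."""
    return [i for i, c in enumerate(clauses, start=1) if lit in c]


def can_hold_one(clauses, n):
    """True iff some run of the constructed program reaches write(z_k) with z_k = 1."""
    k = len(clauses)
    for choice in product([True, False], repeat=n):     # resolve each branching
        comps = [literal_program(v if pos else -v, clauses)
                 for v, pos in zip(range(1, n + 1), choice)]
        start = ((0,) * n, (1,) + (0,) * k)             # (program counters, z_0..z_k)
        seen, stack = {start}, [start]
        while stack:
            pcs, z = stack.pop()
            if all(pc == len(c) for pc, c in zip(pcs, comps)):
                if z[k] == 1:                           # take the skip branch at the end
                    return True
                continue
            for j, c in enumerate(comps):               # interleave: advance component j
                if pcs[j] < len(c):
                    i = c[pcs[j]]
                    z2 = z[:i] + (z[i - 1],) + z[i + 1:]
                    nxt = (pcs[:j] + (pcs[j] + 1,) + pcs[j + 1:], z2)
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return False


def satisfiable(clauses, n):
    """Brute-force SAT check used only for comparison."""
    return any(all(any((lit > 0) == assign[abs(lit) - 1] for lit in c)
                   for c in clauses)
               for assign in product([True, False], repeat=n))
```

On every instance the two functions should agree, which is exactly the statement that $z_k$ is not a constant at the write-statement if and only if the formula is satisfiable.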
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4 Adding Loops and Procedures 

Let us now consider a richer program class: programs with procedures, par- 
allelism and loops. A procedural parallel program comprises a finite set Proc of 
procedure names containing a distinguished name Main. Each procedure name P 
is associated with a statement $\pi_P$, the corresponding procedure body, constructed
according to the grammar 

$$e ::= c \mid x$$

$$\pi ::= x := e \mid \mathrm{write}(x) \mid \mathrm{skip} \mid Q \mid \pi_1 ; \pi_2 \mid \pi_1 \parallel \pi_2 \mid \pi_1 \sqcap \pi_2 \mid \pi^* ,$$

where $Q$ ranges over Proc. A statement of the form $Q$ represents a call to procedure
$Q$ and $\pi^*$ stands for a loop that iterates $\pi$ an indefinite number of
times. Such an indefinite looping construct is consistent with the abstraction 
that branching is non-deterministic. A program is non-recursive if there is an 
order on the procedure names such that in the body of each procedure only 
procedures with a strictly smaller name are called. 

The definition of runs from the previous section can easily be extended to 
the enriched language by the following two clauses:^ 

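$$\mathrm{Runs}(\pi^*) = \mathrm{Runs}(\pi)^* \qquad\qquad \mathrm{Runs}(P) = \mathrm{Runs}(\pi_P) .$$

As usual, we define $X^* = \bigcup_{i \ge 0} X^i$, where $X^0 = \{\varepsilon\}$ and $X^{i+1} = X \cdot X^i$ for a
set $X$ of sequences. The runs of the program are the runs of Main.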

4.1 PSPACE-Hardness of Interprocedural Copy Constant Detection

The goal of this section is to prove the following result. 

Theorem 2. The problem of detecting copy constants in non-recursive procedural
parallel programs is PSPACE-hard.

The proof is by means of a reduction of the QBF (quantified Boolean formulas)
problem to copy constant detection. QBF (called QSAT in [10]) is a well-known
PSPACE-complete problem.



Quantified Boolean Formulas. Let us first recall QBF. A QBF instance is a
quantified Boolean formula,

$$\phi \equiv Q_n x_n \cdots \forall x_2 : \exists x_1 : c_1 \wedge \dots \wedge c_k ,$$

where $Q_n$ is the quantifier $\exists$ if $n$ is odd and $\forall$ if $n$ is even, i.e. quantifiers are
strictly alternating.

^ If the program has recursive procedures, the definition of runs is no longer inductive. 
Then the clauses are meant to specify the smallest sets obeying the given equations, 
which exist by the well-known Knaster-Tarski fixpoint theorem. However, only
non-recursive programs occur in this paper.
As in SAT, each clause $c_i$ is a disjunction of literals, where a literal $l$ is either a
variable from $X = \{x_1, \dots, x_n\}$ or a negated variable from $\neg X = \{\neg x_1, \dots, \neg x_n\}$.
The set of indices of clauses made true by literal $l$ is
$\mathrm{Cl}(l) \stackrel{\mathrm{def}}{=} \{ i \in \{1, \dots, k\} \mid c_i \text{ contains } l \}$.
For later reference the following names are introduced for the
sub-formulas of $\phi$:

$$\phi_0 = c_1 \wedge \dots \wedge c_k \quad\text{and}\quad \phi_i = Q_i x_i : \phi_{i-1} \ \text{ for } 1 \le i \le n ,$$

where again $Q_i$ is $\exists$ if $i$ is odd and $\forall$ if $i$ is even. Clearly, $\phi$ is just $\phi_n$.

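Formula $\phi_i$ is assigned a truth value with respect to a truth assignment
$T \in TA_i \stackrel{\mathrm{def}}{=} \{ T \mid T : \{x_{i+1}, \dots, x_n\} \to \mathbb{B} \}$. We write $T[x \mapsto b]$ for the truth
assignment that maps $x$ to $b \in \mathbb{B}$ and behaves otherwise like $T$. We use this
notation only if $x$ is not already in the domain of $T$. For a truth assignment $T$
we denote by $\mathrm{Cl}(T)$ the set of indices of clauses that are made true by $T$:

$$\mathrm{Cl}(T) \stackrel{\mathrm{def}}{=} \bigcup_{x :\, T(x) = \mathit{tt}} \mathrm{Cl}(x) \ \cup \bigcup_{x :\, T(x) = \mathit{ff}} \mathrm{Cl}(\neg x) .$$

Note that $\mathrm{Cl}(T[x \mapsto \mathit{tt}]) = \mathrm{Cl}(T) \cup \mathrm{Cl}(x)$ and $\mathrm{Cl}(T[x \mapsto \mathit{ff}]) = \mathrm{Cl}(T) \cup \mathrm{Cl}(\neg x)$
(recall that $x$ is not in the domain of $T$). Note also that $TA_n$ contains only the
trivial truth assignment $\emptyset$, for which $\mathrm{Cl}(\emptyset) = \emptyset$.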

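Using this notation, the truth value of a formula with respect to a truth
assignment can be defined as follows:

$$T \models \phi_0 \quad\text{iff}\quad \mathrm{Cl}(T) = \{1, \dots, k\}$$

$$T \models \phi_i \quad\text{iff}\quad \begin{cases} T[x_i \mapsto \mathit{tt}] \models \phi_{i-1} \text{ or } T[x_i \mapsto \mathit{ff}] \models \phi_{i-1}, & \text{if $i$ is odd } (Q_i = \exists) \\ T[x_i \mapsto \mathit{tt}] \models \phi_{i-1} \text{ and } T[x_i \mapsto \mathit{ff}] \models \phi_{i-1}, & \text{if $i$ is even } (Q_i = \forall) \end{cases}$$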
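The inductive definition above can be read as a recursive evaluator. The following is a minimal Python sketch, not part of the original text: the integer encoding of literals (`+i` for $x_i$, `-i` for $\neg x_i$) and the names `cl`, `holds` and `qbf_true` are assumptions made for illustration. A truth assignment $T$ is represented only through its set $\mathrm{Cl}(T)$, which is all the definition needs.

```python
from typing import FrozenSet, List

# Assumed encoding: literal +i stands for x_i, -i for "not x_i".
Clause = List[int]


def cl(literal: int, clauses: List[Clause]) -> FrozenSet[int]:
    """Cl(l): the 1-based indices of the clauses that contain the literal."""
    return frozenset(i for i, c in enumerate(clauses, start=1) if literal in c)


def holds(i: int, covered: FrozenSet[int], clauses: List[Clause]) -> bool:
    """Decide T |= phi_i, where `covered` plays the role of Cl(T)."""
    if i == 0:
        # T |= phi_0 iff Cl(T) = {1, ..., k}
        return covered == frozenset(range(1, len(clauses) + 1))
    via_tt = holds(i - 1, covered | cl(i, clauses), clauses)    # T[x_i -> tt]
    via_ff = holds(i - 1, covered | cl(-i, clauses), clauses)   # T[x_i -> ff]
    # Odd i: existential quantifier; even i: universal quantifier.
    return (via_tt or via_ff) if i % 2 == 1 else (via_tt and via_ff)


def qbf_true(n: int, clauses: List[Clause]) -> bool:
    """phi = phi_n is evaluated under the empty truth assignment, Cl = {}."""
    return holds(n, frozenset(), clauses)
```

For example, `qbf_true(2, [[1, 2], [-1, -2]])` evaluates $\forall x_2 : \exists x_1 : (x_1 \vee x_2) \wedge (\neg x_1 \vee \neg x_2)$. The evaluator takes exponential time in $n$; it only illustrates the semantics, not the reduction.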



The Reduction. From a QBF instance as above, we construct a program, in
which we again use $k + 1$ variables $z_0, z_1, \dots, z_k$ in a similar way as in Sect. 3.
Let the programs $\pi_l$ be defined as in that section.

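Let Proc = $\{\mathit{Main}, P_0, P_1, \dots, P_n\}$ be the set of procedures. The associated
statements are defined as follows:

$$\pi_{\mathit{Main}} \stackrel{\mathrm{def}}{=} z_0 := 1 ;\ z_1 := 0 ;\ \dots ;\ z_k := 0 ;\ P_n ;\ (z_0 := 0 \sqcap \mathrm{skip}) ;\ \mathrm{write}(z_0)$$

$$\pi_{P_0} \stackrel{\mathrm{def}}{=} z_1 := 0 ;\ \dots ;\ z_k := 0 ;\ z_0 := z_k$$

$$\pi_{P_i} \stackrel{\mathrm{def}}{=} \begin{cases} (\pi^*_{x_i} \parallel P_{i-1}) \sqcap (\pi^*_{\neg x_i} \parallel P_{i-1}) & \text{if $i$ is odd} \\ (\pi^*_{x_i} \parallel P_{i-1}) \ ;\ (\pi^*_{\neg x_i} \parallel P_{i-1}) & \text{if $i$ is even} \end{cases} \qquad \text{for } 1 \le i \le n .$$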



Clearly, this program can be constructed from the QBF instance in polynomial
time or logarithmic space. Note that the introduction of procedures is essential
for this to be the case. While we could easily construct an equivalent program
without procedures by inlining the procedures, i.e. by successively replacing each
call to procedure $P_j$ by its body, for $j = 0, \dots, n$, the size of the resulting program
would in general be exponential in $n$, as each procedure $P_j$ is called twice in $P_{j+1}$.
Therefore, we need the procedures to write this program succinctly and to obtain
a logspace-reduction.
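The size argument can be checked with a small calculation. The sketch below is illustrative only; the function names are hypothetical, and it counts just call statements and copies of the body of $P_0$ rather than full program size.

```python
def inlined_copies_of_p0(n: int) -> int:
    """Copies of the body of P_0 that appear once every call in P_n is inlined."""
    copies = 1                  # P_0 contains its own body once
    for _ in range(n):
        copies *= 2             # P_{j+1} calls P_j twice
    return copies


def call_statements_with_procedures(n: int) -> int:
    """Call statements when procedures are kept: two in each of P_1..P_n, one in Main."""
    return 2 * n + 1
```

For $n = 10$ this gives 1024 inlined copies against 21 call statements, which is the exponential versus linear gap the text refers to.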

We show in the following that the variable $z_0$ is not a constant at the
write-statement in procedure Main if and only if the QBF instance is true. This
establishes the PSPACE-hardness claim. ^

Observe again that $z_0$ can hold only the values 0 and 1 at the write-statement
because all variables are initialized by these values and the other assignments
only copy them. Clearly, due to the non-deterministic choice just before the
write-statement, it can hold 0. Thus, $z_0$ is a constant at the write-statement iff
it cannot hold 1 there. Hence we can rephrase our proof goal as follows:

    $z_0$ can hold the value 1 at the write-statement in $\pi_{\mathit{Main}}$
    if and only if $\phi$ is true.                                        (PG)

In the remainder of this section we separately prove the ‘if’ and the ‘only if’
direction.



The “If” Direction. For the ‘if’ claim, we show that procedure $P_n$ has a run
of a special form called a copy chain, if $\phi$ is true.

Definition 3. A (total) segment is a sequence of assignment statements of
the form $\langle z_1 := 0, \dots, z_k := 0, (z_1 := z_0)^{n_1}, \dots, (z_k := z_{k-1})^{n_k}, z_0 := z_k \rangle$, where
$n_i > 0$ for $i = 1, \dots, k$. A (total) copy chain is a concatenation of segments.

Every segment copies the initial value of $z_0$ back to $z_0$ via the sub-chain of
assignments $z_1 := z_0, z_2 := z_1, \dots, z_k := z_{k-1}, z_0 := z_k$, where each $z_i := z_{i-1}$ is the
last assignment in the block $(z_i := z_{i-1})^{n_i}$. Note that the other statements in a
segment do not kill this value; in particular the assignments $\langle z_1 := 0, \dots, z_k := 0 \rangle$
do not affect $z_0$. By induction on the number of segments, a total copy chain
copies the initial value of $z_0$ back to $z_0$ too. Thus, if $P_n$ has a run that is a
total copy chain, then $z_0$ can, at the write-statement in $\pi_{\mathit{Main}}$, hold the value
1 by which it was initialized. As a consequence the following lemma implies the
‘if’-direction of (PG).
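The effect of a segment on $z_0$ can be checked by direct simulation. The following Python sketch is an illustration under assumed conventions: an assignment is a pair `(target, source)` where `source` is either a variable name or the constant 0, and the helper names `run` and `segment` are hypothetical.

```python
def run(assignments, state):
    """Execute a sequence of (target, source) assignments on a copy of `state`."""
    state = dict(state)
    for target, source in assignments:
        # A string source is a variable to copy from; anything else is a constant.
        state[target] = state[source] if isinstance(source, str) else source
    return state


def segment(k, reps):
    """Build <z1:=0,...,zk:=0,(z1:=z0)^n1,...,(zk:=z_{k-1})^nk,z0:=zk>, reps[i-1] = n_i."""
    seq = [(f"z{i}", 0) for i in range(1, k + 1)]
    for i in range(1, k + 1):
        seq += [(f"z{i}", f"z{i - 1}")] * reps[i - 1]
    seq.append(("z0", f"z{k}"))
    return seq


initial = {"z0": 1, "z1": 0, "z2": 0, "z3": 0}
total = segment(3, [1, 2, 1])        # every n_i > 0: a total segment
with_hole = segment(3, [1, 0, 1])    # n_2 = 0: the block for z2 is missing
```

Running `total` (or any concatenation of total segments) from `initial` leaves `z0` at 1, whereas `with_hole` resets it to 0 because the chain from $z_0$ to $z_k$ is broken at $z_2$.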

Lemma 4. If $\phi$ is true, then $P_n$ has a run that is a total copy chain.

In order to enable an inductive proof of this lemma we consider partial copy
chains in which some of the blocks $(z_i := z_{i-1})^{n_i}$ may be missing (i.e. $n_i$ may
be zero).

Definition 5. A partial segment is a sequence of assignment statements of the
form $s = \langle z_1 := 0, \dots, z_k := 0, (z_1 := z_0)^{n_1}, \dots, (z_k := z_{k-1})^{n_k}, z_0 := z_k \rangle$, where
now $n_i \ge 0$ for $i = 1, \dots, k$. For $H \subseteq \{1, \dots, k\}$ we say that $s$ is a partial
segment with holes in $H$ if $H \supseteq \{ i \mid n_i = 0 \}$. A partial copy chain with holes in
$H$ is a concatenation of partial segments with holes in $H$.

^ Recall that PSPACE coincides with co-PSPACE because PSPACE is closed under 
complement. 
Intuitively, the holes in a partial copy chain may be filled by programs running
in parallel to form a total copy chain. Note that a partial copy chain with holes
in $H = \emptyset$ is a total copy chain.

Lemma 6. For all $i = 0, \dots, n$ and all truth assignments $T \in TA_i$ the following
holds: if $T \models \phi_i$ then $P_i$ has a partial copy chain with holes in $\mathrm{Cl}(T)$.

Note that Lemma 6 indeed implies Lemma 4: $\phi$ is true iff the (unique) truth
assignment $T \in TA_n$, viz. $T = \emptyset$, satisfies $\phi_n$. By Lemma 6, $P_n$ then has a partial
copy chain with holes in $\mathrm{Cl}(\emptyset) = \emptyset$, i.e. a total copy chain.

We show Lemma 6 by induction on i. 

Base case ($i = 0$). Suppose given $T \in TA_0$ with $T \models \phi_0$, i.e. $\mathrm{Cl}(T) = \{1, \dots, k\}$.
By definition, $P_0$ has the run $\langle z_1 := 0, \dots, z_k := 0, z_0 := z_k \rangle$, which may be written
as $\langle z_1 := 0, \dots, z_k := 0, (z_1 := z_0)^0, \dots, (z_k := z_{k-1})^0, z_0 := z_k \rangle$, i.e. it is a
partial copy chain with holes in $\{1, \dots, k\} = \mathrm{Cl}(T)$.

Induction step ($i \to i + 1$). Assume that for a given $i$, $0 \le i \le n - 1$, the claim of
Lemma 6 holds for all $T \in TA_i$ (induction hypothesis). Suppose given $T \in TA_{i+1}$
with $T \models \phi_{i+1}$.

If $i + 1$ is even, we have, by definition of $\phi_{i+1}$, $T \models \forall x_{i+1} : \phi_i$, i.e.
$T[x_{i+1} \mapsto \mathit{tt}] \models \phi_i$ and $T[x_{i+1} \mapsto \mathit{ff}] \models \phi_i$. By the induction hypothesis, there are thus two
partial copy chains $r_{\mathit{tt}}$ and $r_{\mathit{ff}}$ of $P_i$ with holes in $\mathrm{Cl}(T[x_{i+1} \mapsto \mathit{tt}]) = \mathrm{Cl}(T) \cup \mathrm{Cl}(x_{i+1})$
and $\mathrm{Cl}(T[x_{i+1} \mapsto \mathit{ff}]) = \mathrm{Cl}(T) \cup \mathrm{Cl}(\neg x_{i+1})$, respectively.

By interleaving each segment of $r_{\mathit{tt}}$ with a single iteration of $\pi_{x_{i+1}}$ appropriately
we can fill the holes from $\mathrm{Cl}(x_{i+1})$; this gives us a run $r_1$ of $\pi^*_{x_{i+1}} \parallel P_i$ that
is a partial copy chain with holes in $\mathrm{Cl}(T)$. Similarly, we can fill the holes from
$\mathrm{Cl}(\neg x_{i+1})$ in $r_{\mathit{ff}}$ by interleaving each segment with an iteration from $\pi_{\neg x_{i+1}}$; this
gives us a run $r_2$ of $\pi^*_{\neg x_{i+1}} \parallel P_i$ that is a partial copy chain with holes in $\mathrm{Cl}(T)$
too. By concatenating $r_1$ and $r_2$ we get a run of $P_{i+1}$ that is a partial copy chain
with holes in $\mathrm{Cl}(T)$.

The argument for the case that $i + 1$ is odd is similar.



The ‘Only If’ Direction. As the constant 1 appears only in the initialization
of $z_0$, $z_0$ can hold the value 1 finally in $\pi_{\mathit{Main}}$ only if $P_n$ has a run that copies
$z_0$ (perhaps via other variables) back to $z_0$. We call such a run a copying run.
Thus, the ‘only if’ direction of (PG) follows from the following lemma.

Lemma 7. If $P_n$ has a copying run then $\phi$ is true.

Note that, while we could restrict attention to runs of a special form in the
‘if’-proof, viz. total and partial copy chains, we have to consider arbitrary runs
here, as any of them may copy $z_0$’s initial value back to $z_0$.

In order to enable an inductive proof, we will be concerned with runs that 
are not (necessarily) yet copying runs but may become so if assignments from a 
set A are added at appropriate places. Each assignment from A may be added 
zero, one or many times. The assignment sets $A$ considered are induced by truth
assignments $T$: $A = \mathrm{Asg}(T) \stackrel{\mathrm{def}}{=} \{ z_i := z_{i-1} \mid i \in \mathrm{Cl}(T) \}$. We call such a run a
potentially copying run with holes in $\mathrm{Asg}(T)$.

Lemma 8. For all $i = 0, \dots, n$ and for all $T \in TA_i$ the following is valid: If
there is a potentially copying run of $P_i$ with holes in $\mathrm{Asg}(T)$ then $T \models \phi_i$.

Note that the case $i = n$ establishes Lemma 7: For the empty truth assignment
$\emptyset \in TA_n$, we have $\mathrm{Asg}(\emptyset) = \emptyset$ and a potentially copying run with holes in $\emptyset$ is
just a copying run. Moreover, $\emptyset \models \phi_n$ iff $\phi$ is true.

We show Lemma 8 by induction on i. 

Base case ($i = 0$). Suppose given $T \in TA_0$. The only run of $P_0$ is

$$r = \langle z_1 := 0, \dots, z_k := 0, z_0 := z_k \rangle .$$

If $r$ is a potentially copying run with holes in $\mathrm{Asg}(T)$, assignments from $\mathrm{Asg}(T)$
can be added to $r$ in such a way that the initial value of $z_0$ influences its final
value. As we have only assignments of the form $z_i := z_{i-1}$ available, this can only
happen via a sub-chain of assignments of the form $z_1 := z_0, z_2 := z_1, \dots, z_k := z_{k-1}$,
where each assignment $z_i := z_{i-1}$ has to take place after $z_i := 0$ and
$z_k := z_{k-1}$ must happen before the final $z_0 := z_k$. Therefore, all assignments
$z_1 := z_0, \dots, z_k := z_{k-1}$ are needed. This means that $\mathrm{Asg}(T)$ must contain all of
them, i.e. $\mathrm{Cl}(T)$ must be $\{1, \dots, k\}$. But then $T \models \phi_0$.

Induction step ($i \to i + 1$). Suppose given $i$, $0 \le i \le n - 1$, and $T \in TA_{i+1}$.
Assume that there is a potentially copying run $r$ of $P_{i+1}$ with holes in $\mathrm{Asg}(T)$.

If $i + 1$ is odd, $r$ is either a run of $\pi^*_{x_{i+1}} \parallel P_i$ or of $\pi^*_{\neg x_{i+1}} \parallel P_i$. We discuss
the case $\pi^*_{x_{i+1}} \parallel P_i$ in detail; the case $\pi^*_{\neg x_{i+1}} \parallel P_i$ is analogous. So let $r$ be an
interleaving of a run $s$ of $\pi^*_{x_{i+1}}$ and a run $t$ of $P_i$. By definition of $\pi_{x_{i+1}}$, $s$ consists only
of assignments from $\mathrm{Asg}(x_{i+1}) \stackrel{\mathrm{def}}{=} \{ z_j := z_{j-1} \mid j \in \mathrm{Cl}(x_{i+1}) \}$. As $r$ can be
interleaved with the assignments in $\mathrm{Asg}(T)$ to form a copying run, $t$ can be interleaved
with assignments from $\mathrm{Asg}(T) \cup \mathrm{Asg}(x_{i+1})$ to form a copying run. Therefore, $t$ is a
potentially copying run with holes in $\mathrm{Asg}(T) \cup \mathrm{Asg}(x_{i+1}) = \mathrm{Asg}(T[x_{i+1} \mapsto \mathit{tt}])$. By
the induction hypothesis thus $T[x_{i+1} \mapsto \mathit{tt}] \models \phi_i$. Consequently, $T \models \exists x_{i+1} : \phi_i$,
i.e. $T \models \phi_{i+1}$.

If $i + 1$ is even, there are runs $s$ and $t$ of $\pi^*_{x_{i+1}} \parallel P_i$ and $\pi^*_{\neg x_{i+1}} \parallel P_i$, respectively,
such that $r = s \cdot t$. It suffices to show that $s$ and $t$ are potentially copying runs
with holes in $\mathrm{Asg}(T)$. An argument like the one in the case ‘$i + 1$ odd’ then yields
that $T[x_{i+1} \mapsto \mathit{tt}] \models \phi_i$ and $T[x_{i+1} \mapsto \mathit{ff}] \models \phi_i$ and thus $T \models \forall x_{i+1} : \phi_i$, i.e.
$T \models \phi_{i+1}$.

As $r = s \cdot t$ is a potentially copying run with holes in $\mathrm{Asg}(T)$ it may be
interleaved with assignments from $\mathrm{Asg}(T)$ to form a copying run $r'$. Clearly, we
can interleave its two parts $s$ and $t$ separately by assignments from $\mathrm{Asg}(T)$ to
sequences $s'$ and $t'$ such that $r' = s' \cdot t'$. It is, however, not obvious that $s'$ and
$t'$ really copy from $z_0$ to $z_0$; if they do so, we are done because then $s$ and $t$
are potentially copying runs with holes in $\mathrm{Asg}(T)$. Of course, there must be a
variable $z_j$ such that the value of $z_0$ is copied by $s'$ to $z_j$ and the value of $z_j$ is
copied by $t'$ to $z_0$; otherwise $z_0$ cannot be copied to $z_0$ by $r'$. But, at first glance,
$z_j$ may be different from $z_0$. It follows from the lemma below that $z_j$ indeed
must be $z_0$, which completes the proof of Lemma 8.

Lemma 9. Let $r$ be some interleaving of a run of $P_i$, $i = 0, \dots, n$, with
assignments of the form $z_l := z_{l-1}$, $l = 1, \dots, k$. Then $r$ copies none of the variables
$z_1, \dots, z_k$ to any variable.

This last lemma is proved by induction on $i$. The interesting argument is in the
base case; the induction step is almost trivial.

Base case. Let $i = 0$ and assume given a variable $z_j$, $j \in \{1, \dots, k\}$. Then $r$ is an
interleaving of $\langle z_1 := 0, \dots, z_k := 0, z_0 := z_k \rangle$ with assignments of the form
$z_l := z_{l-1}$. Assignments of this form can copy only to variables with a higher index.
Thus, just before the assignment $z_j := 0$ at most the variables $z_j, z_{j+1}, \dots, z_k$
can contain the value copied from $z_j$. The contents of $z_j$ is overwritten by the
assignment $z_j := 0$. So immediately after $z_j := 0$ at most $z_{j+1}, \dots, z_k$ can contain
the value copied from $z_j$. This also holds just before the assignment $z_{j+1} := 0$, which
overwrites $z_{j+1}$; and so on. Just after $z_k := 0$, no variable can still contain the
value copied from $z_j$.

Induction step. Let $i > 0$ and assume that the claim is valid for $i - 1$. Any run
of $P_i$ either starts with (if $i$ is even) or is (if $i$ is odd) an interleaving of a run of
$P_{i-1}$ with assignments of the described form. Therefore, $r$ starts with or is an
interleaving of a run of $P_{i-1}$ with such assignments. The property thus follows
immediately from the induction hypothesis.

5 Conclusion 

In this paper we have presented two complexity results with detailed proofs. 
They indicate that the accounts of [7,6,2,3,13] on efficient and complete data- 
flow analysis of parallel programs cannot be generalized significantly beyond 
gen/kill problems. 

The reductions in this paper apply without change also to the may-constant
detection problem in parallel programs. In the may-constant problem [9] we ask
whether a given variable $x$ can hold a given value $k$ at a certain program point
$p$ or not, i.e. whether there is a run from the start of the program to $p$ after
which $x$ holds $k$. In the NP-hardness proof in Sect. 3 we showed that $z_k$ may
hold the value 1 at the write-statement iff the given SAT instance is satisfiable
and, similarly, in Sect. 4 that $z_0$ may hold 1 at the write-statement iff the given
QBF instance is true. This proves that the may-constant problem is NP-complete
for loop-free parallel programs and PSPACE-hard for programs with procedures
and loops. Also the complexity of another data-flow problem, that of detecting
faint variables [4], which is related to program slicing [16,15], can be attacked
with essentially the same reductions.

For the interprocedural parallel problem the current paper only establishes a 
lower bound, viz. PSPACE-hardness. It is left for future work to study the precise 
complexity of this problem. Another interesting question is the complexity of the 
general intraprocedural problem for parallel programs where we have loops but 
no procedures. 
1 Introduction

Assume that we are given the coordinates of n airports. Given an airplane that can fly a 
distance of b miles without refueling, a typical query is to determine the smallest value 
of t such that the airplane can travel between any pair of airports using flight segments 
of length at most b miles, such that the sum of the lengths of the flight segments is not 
longer than t times the direct “as-the-crow-flies” distance between the airports. This 
problem falls under the general category of bottleneck problems. In our case, the stretch 
factor, i.e., the value of t, is a measure of the maximum increase in fuel costs caused 
by choosing a path other than the direct path between any source and any destination. 
(Clearly, this direct path cannot be taken if its length is larger than b miles.) 

Let us formalize this problem. For simplicity, we take the Euclidean metric for the 
distance between two airports. In practice, one needs to take into account the curvature 
of the earth and the wind conditions. 

Let $d \ge 2$ be a small constant. For any two points $p$ and $q$ in $\mathbb{R}^d$, we denote their
Euclidean distance by $|pq|$. Let $S$ be a set of $n$ points in $\mathbb{R}^d$, and let $G$ be an undirected
graph having $S$ as its vertex set. The length of any edge $(p, q)$ of $G$ is defined as $|pq|$.
Furthermore, the length of any path in $G$ between two vertices $p$ and $q$ is defined as
the sum of the lengths of the edges on this path. We call such a graph $G$ a Euclidean
graph. For any two vertices $p$ and $q$ of $G$, we denote by $|pq|_G$ their distance in $G$, i.e.,
the length of a shortest path connecting $p$ and $q$. If there is no path between $p$ and $q$,
then $|pq|_G = \infty$. The stretch factor $t^*$ of $G$ is defined as

$$t^* := \max \left\{ \frac{|pq|_G}{|pq|} : p, q \in S,\ p \ne q \right\} .$$

Note that $t^* = \infty$ if the graph $G$ is not connected.

The bottleneck stretch factor problem is to preprocess the points of $S$ into a data
structure, such that for any real number $b > 0$, we can efficiently compute the stretch
factor of the subgraph of the complete graph on $S$ containing all edges of length at most
$b$.

Let $G = (S, E)$ denote the Euclidean graph on $S$ containing all edges having length
at most $b$. The time complexity of solving the all-pairs-shortest-path problem for $G$ is
an upper bound on the time complexity of computing the stretch factor of $G$. Hence,
running Dijkstra’s algorithm, implemented with Fibonacci heaps, from each vertex
of $G$ gives the stretch factor of $G$ in $O(n^2 \log n + n|E|)$ time (cf. [9]). Note that $|E|$
can be as large as $\binom{n}{2}$. Hence, without any preprocessing, we can answer queries in
$O(n^3)$ time. It may be possible to improve the query time, but we are not aware of any
algorithm that computes the stretch factor in subquadratic time. (For example, we do
not even know if the stretch factor of a Euclidean path can be computed in $o(n^2)$ time.)
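The brute-force query described above can be sketched in a few lines of Python. This is only an illustration: it uses the binary heap from the standard `heapq` module instead of Fibonacci heaps, so its running time is $O(n|E| \log n)$ rather than the bound quoted in the text, and the function name and the representation of points as coordinate tuples are assumptions. It also assumes that the points are pairwise distinct.

```python
import heapq
import math


def bottleneck_stretch_factor(points, b):
    """Exact stretch factor of the graph on `points` with all edges of length <= b."""
    n = len(points)

    def dist(p, q):
        return math.dist(points[p], points[q])

    # Adjacency lists of the bottleneck graph for the query value b.
    adj = [[(q, dist(p, q)) for q in range(n) if q != p and dist(p, q) <= b]
           for p in range(n)]
    worst = 1.0
    for s in range(n):
        # Dijkstra's algorithm from s with lazy deletion of stale heap entries.
        d = [math.inf] * n
        d[s] = 0.0
        heap = [(0.0, s)]
        while heap:
            du, u = heapq.heappop(heap)
            if du > d[u]:
                continue
            for v, w in adj[u]:
                if du + w < d[v]:
                    d[v] = du + w
                    heapq.heappush(heap, (d[v], v))
        for q in range(n):
            if q != s:
                # Unreachable q gives inf, matching t* = infinity.
                worst = max(worst, d[q] / dist(s, q))
    return worst
```

For the three points $(0,0)$, $(1,0)$, $(1,1)$ and $b = 1$, the only detour is between the first and the last point, so the result is $2/\sqrt{2} = \sqrt{2}$; for $b < 1$ the graph is disconnected and the result is infinite.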

A second solution for the bottleneck stretch factor problem is obtained from the
observation that there are only $\binom{n}{2}$ “different” query values $b$. Hence, if we store all
$\binom{n}{2}$ different stretch factors, then a query can be solved in $O(\log n)$ time by searching
with the query value $b$ in the sorted sequence of all $\binom{n}{2}$ Euclidean distances between
the pairs of points of $S$. Clearly, in this case, the preprocessing time and the amount of
space used are at least quadratic in $n$.

This leads to the question whether more efficient solutions exist, if we are satisfied 
with an approximation to the stretch factor of the graph G. 

Let $c_1 \ge 1$ and $c_2 \ge 1$ be real numbers, let $G$ be an arbitrary Euclidean graph
on the point set $S$, and let $t^*$ be the stretch factor of $G$. We say that the real number
$t$ is a $(c_1, c_2)$-approximate stretch factor of $G$, if $t/c_1 \le t^* \le c_2 t$. The current paper
considers the following problem:

Problem 1. The $(c_1, c_2)$-approximate bottleneck stretch factor problem is to preprocess
the points of $S$ into a data structure, such that for any real number $b > 0$, we can
efficiently compute a $(c_1, c_2)$-approximate stretch factor of the subgraph of the complete
graph on $S$ containing all edges of length at most $b$.



1.1 Our Results 

In this paper, we will present a data structure that solves Problem 1. The general
approach, which is given in Section 3, is as follows. We partition the sequence of $\binom{n}{2}$
exact stretch factors into $O(\log n)$ subsequences, such that any two stretch factors in
the same subsequence are approximately equal. Our data structure contains a sequence
of $O(\log n)$ stretch factors, one from each subsequence. We also store a corresponding
sequence of $O(\log n)$ distances between pairs of points. The latter sequence is used to
search in $O(\log \log n)$ time in the sequence of $O(\log n)$ stretch factors. The result is
a data structure of size $O(\log n)$ that can be used to solve the queries of Problem 1 in
$O(\log \log n)$ time. The time to build this data structure, however, is at least quadratic in
$n$.

In Section 4, we show that it suffices to use a sequence of $O(\log n)$ approximate
stretch factors instead of the sequence of $O(\log n)$ exact stretch factors. Since the graphs
whose stretch factors we have to approximate may have a quadratic number of edges,
however, we need to make one more approximation step. That is, in Section 5, we
use Callahan and Kosaraju’s well-separated pair decomposition [7] to approximate the
graph $G$ containing all edges of length at most $b$ by a graph $H$ having $O(n \log n)$
edges and having approximately the same stretch factor. Then we use the algorithm of
Narasimhan and Smid [13] to compute an approximate stretch factor of the graph $H$.
In this way, we obtain the main result of this paper: a data structure of size $O(\log n)$,
query time $O(\log \log n)$, and that can be built in subquadratic time.
1.2 Related Work

There has been substantial work on the problem of constructing a Euclidean graph on a
given set of points whose stretch factor is bounded by a given constant $t > 1$. A good
overview of results in this direction can be found in the surveys by Eppstein [11] and
Smid [15].

The problem of approximating the stretch factor of any given Euclidean graph has 
been considered by the authors in [13]. There, we prove the following result, which will 
be used in the current paper. 

Theorem 1 ([13]). Let $S$ be a set of $n$ points in $\mathbb{R}^d$, let $G = (S, E)$ be an arbitrary
connected Euclidean graph, let $\beta \ge 1$ be an integer constant, and let $\epsilon$ be a real
constant, such that $0 < \epsilon \le 1/2$. In $O(|E| n^{1/\beta} \log^2 n)$ expected time, we can compute a
$(2\beta(1 + \epsilon), 1 + \epsilon)$-approximate stretch factor of $G$.

The proof of this theorem uses the well-separated pair decomposition (WSPD)
of Callahan and Kosaraju [7]. We use this WSPD in Section 5 to approximate the
graph containing all edges of length at most $b$ by a graph having $O(n \log n)$ edges
and having approximately the same stretch factor. For other applications of the WSPD,
see [2, 5, 6, 7].

To the best of our knowledge, the exact and approximate bottleneck stretch factor 
problems have not been considered before. 

2 Some Preliminary Results 

We start by introducing some notation and terminology. Let $S$ be a set of $n$ points in $\mathbb{R}^d$,
and let $m$ be the number of distinct distances defined by any two distinct points of $S$.
Let $\delta_1 < \delta_2 < \dots < \delta_m$ be the sorted sequence of these distances. Note that $m \le \binom{n}{2}$.

Let $G_0$ be the graph on $S$ having no edges. Furthermore, for any $i$, $1 \le i \le m$, let
$G_i$ be the $i$-th bottleneck graph, i.e., the subgraph of the complete graph on $S$ containing
all edges of length at most $\delta_i$. Clearly, for any $i$, $0 \le i < m$, $G_i$ is a subgraph of $G_{i+1}$,
and $G_m$ is the complete graph on $S$. For any $i$, $0 \le i \le m$, we denote by $t_i^*$ the (exact)
stretch factor of the graph $G_i$. The sequence $T = (t_0^*, t_1^*, \dots, t_m^*)$ is referred to
as the stretch factor spectrum of $S$.

It is clear that determining the stretch factor spectrum of $S$ solves the exact version
of the bottleneck stretch factor problem. However, this involves determining the stretch
factor of $O(n^2)$ distinct graphs, which is likely to be prohibitively expensive.

First, we observe that $t_0^* = \infty$, $t_m^* = 1$, and $t_{i+1}^* \le t_i^*$ for all $i$, $0 \le i < m$. Also, the
graph $G_0$ is not connected, whereas the graph $G_m$ is connected. Let $k$ be the smallest
index such that the graph $G_k$ is connected. Then $t_0^* = t_1^* = \dots = t_{k-1}^* = \infty$, $t_k^*$ is
finite, and $1 = t_m^* \le t_{m-1}^* \le \dots \le t_{k+1}^* \le t_k^*$. We will henceforth refer to the distance
$\delta_k$ (corresponding to index $k$) as the connectivity threshold.

The following lemma characterizes the connectivity threshold. It is a restatement of 
the well-known folklore theorem that states that the minimum spanning tree is also a 
bottleneck minimum spanning tree. 

Lemma 1. Let $T$ be a minimum spanning tree of $S$. Then the longest edge in $T$ has
length $\delta_k$.
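Lemma 1 gives a direct way to compute the connectivity threshold. The sketch below uses the textbook $O(n^2)$ version of Prim's algorithm on the complete Euclidean graph; it is an illustration only (the function name is hypothetical and the point set is assumed to be non-empty), not the faster minimum spanning tree algorithm alluded to later in the text.

```python
import math


def connectivity_threshold(points):
    """Length of the longest edge of a Euclidean minimum spanning tree (Prim, O(n^2))."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known connection of each vertex to the tree
    best[0] = 0.0              # start the tree at vertex 0
    longest = 0.0
    for _ in range(n):
        # Pick the vertex outside the tree that is cheapest to attach.
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        longest = max(longest, best[u])
        for v in range(n):
            if not in_tree[v]:
                best[v] = min(best[v], math.dist(points[u], points[v]))
    return longest
```

For the points $(0,0)$, $(1,0)$, $(1,1)$, $(5,1)$ the spanning tree uses edges of lengths 1, 1 and 4, so the threshold is 4: any query value below 4 yields a disconnected bottleneck graph.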
Using Lemma 1, we can prove an upper bound on the stretch factor of the bottleneck
graph $G_k$. The bound is useful because it suggests that binary search on the stretch
factor spectrum can be performed efficiently.

Lemma 2. We have $t_k^* \le n - 1$.

3 A First Solution 

We start by describing the general idea of our solution to the approximate bottleneck
stretch factor problem. Let $c > 1$ be an arbitrary constant. For the preprocessing phase,
we partition the index set $\{k, k + 1, \dots, m\}$ into $O(\log n)$ subsets of consecutive
integers, such that for any two indices $i$ and $i'$ of the same subset, the stretch factors $t_i^*$ and
$t_{i'}^*$ are within a factor of $c$ of each other. This partition induces partitions of the two
sequences $\delta_i$, $k \le i \le m$, and $t_i^*$, $k \le i \le m$, into $O(\log n)$ subsequences. For each $j$, we
let $a_j$ denote the smallest index in the $j$-th subset of the partition of $\{k, k + 1, \dots, m\}$.

Our data structure consists of the $O(\log n)$ values $\delta_{a_j}$ and $t_{a_j}^*$. For the query phase,
given a value $b > 0$, we search for the largest index $j$ such that $\delta_{a_j} \le b$, and report the
value of $t_{a_j}^*$. We will prove later that $t_{a_j}^*$ approximates the stretch factor of the subgraph
of the complete graph on $S$ containing all edges of length at most $b$. In the rest of this
section, we will formalize this approach.

As mentioned above, we fix a constant $c > 1$. For any integer $j \ge 0$, we define

$$X_j := \{ i : k \le i \le m \text{ and } c^j \le t_i^* < c^{j+1} \} .$$

Since all stretch factors $t_i^*$ are greater than or equal to one, these sets $X_j$ partition the
set $\{k, k + 1, \dots, m\}$. Also, if $X_j \ne \emptyset$, then there is an index $i$ such that $c^j \le t_i^*$.
Since $t_i^* \le t_k^*$ and, by Lemma 2, $t_k^* \le n - 1$, we have $c^j \le n - 1$, which implies that
$j \le \lfloor \log_c (n - 1) \rfloor$.

Let $\ell$ be the number of non-empty sets $X_j$. Then $\ell \le 1 + \lfloor \log_c (n - 1) \rfloor$. Each
non-empty set $X_j$ is a set of consecutive integers. We denote these non-empty sets by
$I_1, I_2, \dots, I_\ell$, and write them as $I_j = \{a_j, a_j + 1, \dots, a_{j+1} - 1\}$, $1 \le j \le \ell$, where
$k = a_1 < a_2 < \dots < a_{\ell+1} = m + 1$.

Lemma 3. Let $j$ be any integer such that $1 \le j \le \ell$, and let $i$ and $i'$ be any two
elements of the set $I_j$. Then $1/c \le t_i^* / t_{i'}^* \le c$.

Now we are ready to give the data structure for solving the approximate bottleneck
stretch factor problem. This data structure consists of the connectivity threshold $\delta_k$, and
two arrays $\Delta[1 \dots \ell]$ and $SF[1 \dots \ell]$, where $\Delta[j] = \delta_{a_j}$ and $SF[j] = t_{a_j}^*$. Note that the
array $\Delta$ is sorted in increasing order, whereas the array $SF$ is sorted in non-increasing
order.

Recall that in a query, we get a real number $b > 0$, and have to compute an
approximate stretch factor $t$ of the graph containing all edges having length at most $b$. Such a
query is answered as follows. If $b < \delta_k$, then the subgraph of the complete graph on $S$
containing all edges of length at most $b$ is not connected. Hence, we report $t := \infty$. If
$b \ge \delta_k$, then we search in $\Delta$ for the largest index $j$ such that $\Delta[j] \le b$, and report the
value of $t$ defined as $t := SF[j]$.
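The data structure and its query can be sketched as follows in Python. The sketch assumes that the exact values $\delta_k, \dots, \delta_m$ and $t_k^*, \dots, t_m^*$ are already available as two lists (computing them is the expensive part discussed in the text); the names `build`, `query` and `bucket_index` are hypothetical, and the arrays are 0-based rather than 1-based.

```python
import bisect
import math


def bucket_index(t, c):
    """The j with c**j <= t < c**(j+1), for t >= 1 and c > 1."""
    j = 0
    while c ** (j + 1) <= t:
        j += 1
    return j


def build(deltas, stretch, c):
    """deltas: increasing distances delta_k..delta_m; stretch: matching exact stretch
    factors, non-increasing and >= 1. Returns the arrays (Delta, SF)."""
    Delta, SF = [], []
    current = None
    for d, t in zip(deltas, stretch):
        j = bucket_index(t, c)
        if j != current:            # first index a_j of a new subset I_j
            Delta.append(d)
            SF.append(t)
            current = j
    return Delta, SF


def query(Delta, SF, b):
    """Approximate stretch factor of the graph with all edges of length <= b."""
    if b < Delta[0]:                # Delta[0] is the connectivity threshold delta_k
        return math.inf
    j = bisect.bisect_right(Delta, b) - 1   # largest j with Delta[j] <= b
    return SF[j]
```

With `c = 2`, distances `[1, 2, 3, 4, 5, 6]` and stretch factors `[7.0, 5.0, 3.5, 3.0, 1.9, 1.0]`, only three entries are stored; a query with `b = 2.5` reports 7.0 while the exact value is 5.0, which is within the factor `c`.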
Lemma 4. Assume that $b \ge \delta_k$. Let $t^*$ be the exact stretch factor of the subgraph of the
complete graph on $S$ containing all edges of length at most $b$. The value of $t$ reported
by the query algorithm satisfies $t/c \le t^* \le ct$.

Proof. Consider the index $j$ that was found by the query algorithm. Hence, $t = SF[j] = t_{a_j}^*$.
Note that $a_j \in I_j$. Let $i$ be the largest index such that $\delta_i \le b$. Then $t^* = t_i^*$, and $i$
is also an element of $I_j$. The claim now follows from Lemma 3. □

Let us analyze the complexity of our solution. We need $O(\ell) = O(\log n)$ space to
store the data structure. If we implement the query algorithm using binary search, then
the query time is bounded by $O(\log \ell) = O(\log \log n)$.

It remains to describe and analyze the preprocessing algorithm. First, we compute
the sorted sequence of $m \le \binom{n}{2}$ distances. This takes $O(n^2 \log n)$ time. Then we
compute a minimum spanning tree of $S$. The length of a longest edge in this tree gives us
the distance $\delta_k$, and its index $k$. (See Lemma 1.) This step also takes $O(n^2 \log n)$ time.
(Note that a minimum spanning tree of a set of $n$ points in $\mathbb{R}^d$ can be computed faster.
The $O(n^2 \log n)$-time bound, however, is good enough for the moment.) Now consider
the sequence

$$1 = t_m^* \le t_{m-1}^* \le \dots \le t_{k+1}^* \le t_k^* \le n - 1 \qquad (1)$$

of stretch factors. The index sets $I_1, I_2, \dots, I_\ell$ are obtained by locating the real numbers
$c^j$, $0 \le j \le \lfloor \log_c (n - 1) \rfloor$, in the sequence (1). Let $T_{SF}(n)$ denote the worst-case
time to compute the exact stretch factor of any Euclidean graph on $n$ points. Then,
using binary search, we locate $c^j$ in the sequence (1) in time
$O(T_{SF}(n) \log(m - k + 1)) = O(T_{SF}(n) \log n)$. Hence, we can compute all index sets $I_j$, $1 \le j \le \ell$, in
$O(T_{SF}(n) \log^2 n)$ total time. Given these index sets, we can compute the two arrays
$\Delta$ and $SF$ in $O(T_{SF}(n) \log n)$ time. If we write the constant $c$ as $1 + \epsilon$, then we have
proved the following result.

Theorem 2. Let $S$ be a set of $n$ points in $\mathbb{R}^d$, and let $\epsilon > 0$ be a constant. For the
$(1 + \epsilon, 1 + \epsilon)$-approximate bottleneck stretch factor problem, there is a data structure
that can be built in $O(n^2 \log n + T_{SF}(n) \log^2 n)$ time, that has size $O(\log n)$, and
whose query time is bounded by $O(\log \log n)$.

As mentioned in Section 1, the time complexity for computing the stretch factor of
an arbitrary Euclidean graph is bounded by $O(n^3)$. Even though it may be possible to
improve this upper bound, it is probably very hard to get a subquadratic time bound.
Therefore, in the next section, we show that the preprocessing time can be reduced, at
the cost of an increase in the approximation factor. The main idea is to store approximate
stretch factors in the array $SF$.

4 An Improved Solution 

Here we exploit the fact that approximate stretch factors can be computed more
efficiently than exact stretch factors. In the previous section, we fixed a constant $c > 1$, and
partitioned the sequence (1) of exact stretch factors into $O(\log n)$ subsets, such that any
two stretch factors in the same subset are within a factor of $c$ of each other. We obtained
this partition by locating the values $c^j$, $0 \le j \le \lfloor \log_c (n - 1) \rfloor$, in the sorted sequence
(1). In this section, we fix two additional constants $c_1$ and $c_2$ that are both greater than
or equal to one. For any $i$, $k \le i \le m$, let $t_i$ be a $(c_1, c_2)$-approximate stretch factor of
the bottleneck graph $G_i$. Hence, we have $t_i / c_1 \le t_i^* \le c_2 t_i$. We will show how to use
the sequence $t_m, t_{m-1}, \dots, t_k$ of approximate stretch factors to partition the index set
$\{k, k + 1, \dots, m\}$ into $O(\log n)$ subsets, such that for any two indices $i$ and $i'$ within
the same subset, the exact stretch factors $t_i^*$ and $t_{i'}^*$ are approximately equal. (The
approximation factor depends on $c$, $c_1$, and $c_2$.) This partition is obtained by locating the
values $c^j$ in the sequence $t_m, t_{m-1}, \dots, t_k$. Here, we have to be careful, because the
values $t_i$ are not sorted. They are, however, “approximately” sorted, and we will see
that this suffices for our purpose.

Let $x > 0$ be a real number. We want to use binary search to “approximately” locate
$x$ in the “approximately” sorted sequence $t_m, t_{m-1}, \ldots, t_k$. We specify this algorithm
by its decision tree.* This tree is a balanced binary tree that enables us to search in a
sequence of numbers that have indices $k, k+1, \ldots, m$. More precisely, the leaves of the
tree store the indices $k, k+1, \ldots, m$, in this order, from left to right, and each internal
node $u$ of the tree stores the smallest index that is contained in the right subtree of $u$.
Given the real number $x > 0$, we search as follows:

Algorithm search(x)

    u := root of the decision tree;
    while u is not a leaf do
        j := index stored in u;
        if x <= t_j then u := right child of u else u := left child of u endif
    endwhile;
    return the index stored in u



Lemma 5. Let $x > 0$ be a real number, and let $z$ be the index that is returned by
algorithm search($x$). For each $i$, $k \le i < z$, we have $t_i^* \ge x/c_1$, whereas for each $i$,
$z < i \le m$, we have $t_i^* < c_2 x$.

Hence, running algorithm search($x$) implicitly partitions the sequence $t_k^*, t_{k+1}^*, \ldots, t_m^*$
of exact stretch factors into the following three subsequences: (i) $t_k^*, t_{k+1}^*, \ldots, t_{z-1}^*$;
these are all greater than or equal to $x/c_1$, (ii) $t_z^*$, and (iii) $t_{z+1}^*, t_{z+2}^*, \ldots, t_m^*$; these are
all less than $c_2 x$.
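The search can be sketched in a few lines without building the decision tree. The sketch below is only an illustration: the interval `[lo, hi]` stands for the current tree node, `approx_sf` is a hypothetical callback returning the approximate stretch factor $t_j$, and the comparison is taken to be non-strict, which is what the guarantees of Lemma 5 suggest.

```python
from typing import Callable


def search(approx_sf: Callable[[int], float], k: int, m: int, x: float) -> int:
    """Approximately locate x among t_k, ..., t_m (roughly decreasing in the index).

    The interval [lo, hi] plays the role of the current decision-tree node,
    and j is the smallest index contained in its right subtree.
    """
    lo, hi = k, m
    while lo < hi:                  # "while u is not a leaf"
        mid = (lo + hi) // 2
        j = mid + 1                 # index stored in the current node
        if x <= approx_sf(j):
            lo = j                  # continue in the right child
        else:
            hi = mid                # continue in the left child
    return lo


# Toy data (made up): exact stretch factors decrease with the index, and every
# approximation t_i satisfies t_i / c1 <= t_i* <= c2 * t_i for c1 = c2 = 1.5.
exact = [100.0 / (i + 1) for i in range(100)]
approx = [t * (1.4 if i % 2 == 0 else 0.7) for i, t in enumerate(exact)]
z = search(lambda j: approx[j], 0, len(approx) - 1, 10.0)
```

Each call evaluates `approx_sf` once per level, so a search costs $O(\log n)$ approximate stretch factor computations.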

We are now ready to give the algorithm that partitions the sequence $t_k^*, t_{k+1}^*, \ldots, t_m^*$
of exact stretch factors into $O(\log n)$ subsets, such that any two stretch factors in the
same subset are approximately equal. First, we run algorithm search($c$). Let $z$ be the
index returned. Then we report the two sets $\{z\}$ and $\{z+1, z+2, \ldots, m\}$ of indices.
Next, we run algorithm search($c^2$) on the index set $\{k, k+1, \ldots, z-1\}$. This results
in a partition of the latter set into three subsets. The “last” two subsets are reported,
whereas the “first” subset is partitioned further by running algorithm search($c^3$). After
$O(\log n)$ iterations, we obtain the partition we are looking for.
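The iteration just described can be sketched as follows. This is a hedged illustration rather than the authors' code: `partition` and `approx_sf` are names introduced here, index sets are returned as inclusive ranges, and the search routine is repeated so that the block is self-contained.

```python
from typing import Callable, List, Tuple


def search(approx_sf: Callable[[int], float], k: int, m: int, x: float) -> int:
    """Approximate binary search over the indices k..m (see the algorithm above)."""
    lo, hi = k, m
    while lo < hi:
        mid = (lo + hi) // 2
        j = mid + 1
        if x <= approx_sf(j):
            lo = j
        else:
            hi = mid
    return lo


def partition(approx_sf: Callable[[int], float], k: int, m: int, c: float) -> List[Tuple[int, int]]:
    """Split {k, ..., m} into consecutive index sets, given as inclusive (first, last) ranges."""
    sets = []
    hi, power = m, c
    while hi >= k:
        z = search(approx_sf, k, hi, power)   # search(c), search(c^2), search(c^3), ...
        if z < hi:
            sets.append((z + 1, hi))          # the "last" subset, if non-empty
        sets.append((z, z))                   # the singleton {z}
        hi, power = z - 1, power * c          # continue with the "first" subset
    sets.reverse()                            # report the sets in increasing index order
    return sets
```

Every round removes at least the index $z$, so the loop always terminates; the $O(\log n)$ bound on the number of rounds relies on the stretch factors being polynomially bounded, as in the text.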

Let $\ell$ be the number of non-empty index sets that are computed by this algorithm.
As in Section 3, we denote these by $I_1, I_2, \ldots, I_\ell$, and write them as
$I_j = \{a_j, a_j+1, \ldots, a_{j+1}-1\}$, $1 \le j \le \ell$, where $k = a_1 < a_2 < \cdots < a_{\ell+1} = m+1$. It is easy to
see that $\ell = O(\log n)$.

* Note that this decision tree is not constructed (its size is quadratic in n); it is just a convenient
way to describe the algorithm. The decision tree represents all possible computations of the
algorithm on any input x.

Lemma 6. Let $y$ be any integer such that $1 \le y \le \ell$, and let $i$ and $i'$ be any two
elements of the set $I_y$. Then $1/(c c_1 c_2) \le t_i^*/t_{i'}^* \le c c_1 c_2$.

The data structure for solving the approximate bottleneck stretch factor problem
consists of the connectivity threshold $\delta_k$, and two arrays $\Delta[1 \ldots \ell]$ and $SF_{approx}[1 \ldots \ell]$,
where $\Delta[j] = \delta_{a_j}$ and $SF_{approx}[j] = t_{a_j}$.

The query algorithm is basically the same as before. Given any real number $b > 0$,
we do the following. If $b < \delta_k$, then the subgraph of the complete graph on $S$ containing
all edges of length at most $b$ is not connected. Hence, we report $t := \infty$. If $b \ge \delta_k$, then
we search in $\Delta$ for the largest index $j$ such that $\Delta[j] \le b$, and report the value of $t$
defined as $t := SF_{approx}[j]$.
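A minimal sketch of this query, assuming the two arrays are stored as 0-based Python lists with `delta` sorted increasingly and `delta[0]` equal to the connectivity threshold (the names `query`, `delta` and `sf_approx` are chosen here for illustration):

```python
import bisect
import math
from typing import Sequence


def query(delta_k: float, delta: Sequence[float], sf_approx: Sequence[float], b: float) -> float:
    """Report an approximate stretch factor of the graph with all edges of length <= b."""
    if b < delta_k:
        return math.inf                        # the bottleneck graph is not connected
    j = bisect.bisect_right(delta, b) - 1      # largest j with delta[j] <= b
    return sf_approx[j]
```

Since the arrays have $O(\log n)$ entries, the binary search inside `bisect_right` takes $O(\log\log n)$ comparisons.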

Lemma 7. Assume that $b \ge \delta_k$. Let $t^*$ be the exact stretch factor of the subgraph of the
complete graph on $S$ containing all edges of length at most $b$. The value of $t$ reported
by the query algorithm satisfies $t/(c c_1^2 c_2) \le t^* \le c c_1 c_2^2 t$.

Proof. Let $j$ be the largest index such that $\Delta[j] \le b$. Then $t = SF_{approx}[j] = t_{a_j}$. Let
$i$ be the largest index such that $\delta_i \le b$. Then $t^* = t_i^*$. Since $i$ and $a_j$ both belong to
the index set $I_j$, Lemma 6 implies that $1/(c c_1 c_2) \le t_i^*/t_{a_j}^* \le c c_1 c_2$. The lemma now
follows from the fact that $1/c_1 \le t_{a_j}^*/t_{a_j} \le c_2$. □

It is clear that the data structure has size $O(\log n)$, and that the query time is
bounded by $O(\log\log n)$. In the rest of this section, we analyze the time that is needed
to construct the data structure. We will use the following notation.

- $T_{MST}(n)$: the time needed to compute a minimum spanning tree of a set of $n$ points
in $\mathbb{R}^d$.

- $T_{rank}(n)$: the time needed to compute the rank of any positive real number $\delta$ in the
set of distances in a set of $n$ points in $\mathbb{R}^d$. (The rank of $\delta$ is the number of distances
that are less than or equal to $\delta$.)

- $T_{approxSF}(n)$: the time needed to compute a $(c_1, c_2)$-approximate stretch factor of
any bottleneck graph on a set of $n$ points in $\mathbb{R}^d$.

- $T_{sel}(n)$: the time needed to compute the $i$-th smallest distance in a set of $n$ points
in $\mathbb{R}^d$, for any $i$, $1 \le i \le \binom{n}{2}$.

The preprocessing algorithm starts by computing a minimum spanning tree of the
point set $S$. Let $\delta$ be the length of a longest edge in this tree. Note that the rank of $\delta$ is
equal to $k$. Hence, we can compute the distance $\delta_k = \delta$, and the corresponding index
$k$, in $O(T_{MST}(n) + T_{rank}(n))$ time. Given $k$ and $\delta_k$, we can compute the partition of
$\{k, k+1, \ldots, m\}$ into non-empty index sets $I_j$, in $O(T_{approxSF}(n) \log^2 n)$ time. Given
this partition, we can compute the array $SF_{approx}[1 \ldots \ell]$ in $O(T_{approxSF}(n) \log n)$
time. To compute the array $\Delta[1 \ldots \ell]$, we have to solve $O(\log n)$ selection queries of
the form “given an index $j$, compute the $a_j$-th smallest distance $\delta_{a_j}$ in the point set
$S$”. One such query takes $T_{sel}(n)$ time. Hence, we can compute the entire array $\Delta$ in
$O(T_{sel}(n) \log n)$ time.

We observe that $T_{rank}(n) = O(T_{sel}(n) \log n)$: we can compute the rank of a positive
real number $\delta$ by performing a binary search in the index set $\{1, 2, \ldots, \binom{n}{2}\}$.
During this search, comparisons are resolved in $T_{sel}(n)$ time.
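The reduction from ranking to selection can be sketched as below. The selection oracle here is a brute-force stand-in (`make_select` is a hypothetical helper that simply sorts all distances); in the setting of the text it would be replaced by a subquadratic distance selection algorithm.

```python
import itertools
import math
from typing import Callable, Sequence, Tuple


def make_select(points: Sequence[Tuple[float, ...]]):
    """Brute-force selection oracle: select(i) is the i-th smallest distance (1-based)."""
    dists = sorted(math.dist(p, q) for p, q in itertools.combinations(points, 2))
    return (lambda i: dists[i - 1]), len(dists)


def rank(select: Callable[[int], float], total: int, delta: float) -> int:
    """Number of distances <= delta, using O(log total) calls to select."""
    lo, hi = 0, total
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if select(mid) <= delta:    # each comparison costs one selection query
            lo = mid
        else:
            hi = mid - 1
    return lo
```

With `total` equal to $\binom{n}{2}$, the loop makes $O(\log n)$ selection queries, which matches the bound stated above.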

If we write the constant $c$ as $1 + \epsilon$, then we obtain the following result.

Theorem 3. Let $S$ be a set of $n$ points in $\mathbb{R}^d$, and let $\epsilon > 0$, $c_1 \ge 1$, and $c_2 \ge 1$ be
constants. For the $((1+\epsilon) c_1^2 c_2, (1+\epsilon) c_1 c_2^2)$-approximate bottleneck stretch factor problem,
there is a data structure that can be built in
$O(T_{MST}(n) + T_{approxSF}(n) \log^2 n + T_{sel}(n) \log n)$ time, that has size $O(\log n)$, and whose query
time is $O(\log\log n)$.



5 A Fast Implementation of the Improved Algorithm 

In order to apply Theorem 3, we need good upper bounds on the functions $T_{MST}(n)$,
$T_{sel}(n)$, and $T_{approxSF}(n)$. For the first two functions, subquadratic bounds are known.
Theorem 1 implies an upper bound on $T_{approxSF}(n)$. We run the algorithm of [13]
on the bottleneck graph. Since such a graph can have a quadratic number of edges,
however, this gives a bound that is at least quadratic in $n$. In Section 5.1, we will show
that the bottleneck graph $G_i$ can be approximated by a graph $H_i$ having fewer edges.
That is, $H_i$ has $O(n \log n)$ edges, and its stretch factor is approximately equal to that of
$G_i$. This will allow us to approximate the stretch factor of $G_i$ in subquadratic time.

The computation of the graph Hi is based on the well-separated pair decomposition, 
devised by Callahan and Kosaraju [7]. We briefly review well-separated pairs and some 
of their relevant properties. 

Definition 1. Let $s > 0$ be a real number, and let $A$ and $B$ be two finite sets of points
in $\mathbb{R}^d$. We say that $A$ and $B$ are well-separated w.r.t. $s$, if there are two disjoint
$d$-dimensional balls $C_A$ and $C_B$, having the same radius, such that (i) $C_A$ contains all
points of $A$, (ii) $C_B$ contains all points of $B$, and (iii) the distance between $C_A$ and $C_B$
is at least equal to $s$ times the radius of $C_A$.

We will assume that s is a constant, called the separation constant. 

Lemma 8. Let $A$ and $B$ be two finite sets of points that are well-separated w.r.t. $s$, let
$x$ and $p$ be points of $A$, and let $y$ and $q$ be points of $B$. Then (i) $|xy| \le (1 + 4/s) \cdot |pq|$,
and (ii) $|px| \le (2/s) \cdot |pq|$.
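To make the definition concrete, here is a small sketch of a sufficient test for well-separatedness. It is an assumption of this sketch, not part of the definition, that both balls are centred at the centre of the bounding box of their point set; a `False` answer therefore only means that this particular choice of balls does not work.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def enclosing_ball(points: Sequence[Point]):
    """Centre of the bounding box, and the radius needed to cover all points."""
    dim = len(points[0])
    centre = tuple(
        (min(p[i] for p in points) + max(p[i] for p in points)) / 2 for i in range(dim)
    )
    radius = max(math.dist(centre, p) for p in points)
    return centre, radius


def well_separated(a: Sequence[Point], b: Sequence[Point], s: float) -> bool:
    """Sufficient test: True implies that A and B are well-separated w.r.t. s."""
    centre_a, radius_a = enclosing_ball(a)
    centre_b, radius_b = enclosing_ball(b)
    r = max(radius_a, radius_b)                   # both balls get the same radius
    gap = math.dist(centre_a, centre_b) - 2 * r   # distance between the two balls
    return gap > 0 and gap >= s * r
```

For a pair accepted by this test, the two inequalities of Lemma 8 can be checked directly on all point combinations.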



Definition 2 ([7]). Let $S$ be a set of $n$ points in $\mathbb{R}^d$, and $s > 0$ a real number. A
well-separated pair decomposition (WSPD) for $S$ (w.r.t. $s$) is a sequence of pairs of
non-empty subsets of $S$, $\{A_1, B_1\}, \{A_2, B_2\}, \ldots, \{A_\ell, B_\ell\}$, such that

1. $A_i \cap B_i = \emptyset$, for all $i = 1, 2, \ldots, \ell$,

2. for any two distinct points $p$ and $q$ of $S$, there is exactly one pair $\{A_i, B_i\}$ in the
sequence, such that (i) $p \in A_i$ and $q \in B_i$, or (ii) $p \in B_i$ and $q \in A_i$,

3. $A_i$ and $B_i$ are well-separated w.r.t. $s$, for all $i = 1, 2, \ldots, \ell$.

The integer $\ell$ is called the size of the WSPD.

In [5], Callahan shows that a WSPD of size $\ell = O(n \log n)$ can be computed, such
that each pair $\{A_i, B_i\}$ contains at least one singleton set. This WSPD is computed
using a binary tree $T$, called the split tree. We briefly describe the main idea. The split
tree is similar to a kd-tree. Callahan starts by computing the bounding box of the points
of $S$, which is successively split by d-dimensional hyperplanes, each of which is
orthogonal to one of the axes. If a box is split, he takes care that each of the two resulting
boxes contains at least one point of $S$. As soon as a box contains exactly one point, the
process stops (for this box).

The resulting binary tree $T$ stores the points of $S$ at its leaves; one leaf per point.
Also, each node $u$ of $T$ is associated with a subset of $S$. We denote this subset by $S_u$; it
is the set of all points of $S$ that are stored in the subtree of $u$.

The split tree $T$ can be computed in $O(n \log n)$ time. Callahan shows that, given $T$, a
WSPD of size $\ell = O(n \log n)$ can be computed in $O(n \log n)$ time. Each pair $\{A_i, B_i\}$
in this WSPD is represented by two nodes $u_i$ and $v_i$ of $T$, i.e., we have $A_i = S_{u_i}$ and
$B_i = S_{v_i}$. Since at least one of $A_i$ and $B_i$ is a singleton set, at least one of $u_i$ and $v_i$ is
a leaf of $T$.
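The idea of the split tree can be illustrated by the naive sketch below. It is a simplification under stated assumptions: the points are pairwise distinct, the bounding box of the current subset is cut at the midpoint of its longest side, and every node stores its subset $S_u$ explicitly. It does not attempt to reach the $O(n \log n)$ construction time of [5].

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    points: List[Point]               # the subset S_u stored below this node
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def split_tree(points: List[Point]) -> Node:
    """Recursively split the bounding box until every box holds exactly one point."""
    node = Node(points)
    if len(points) == 1:
        return node                   # leaf: one point per leaf
    dim = len(points[0])
    lows = [min(p[i] for p in points) for i in range(dim)]
    highs = [max(p[i] for p in points) for i in range(dim)]
    axis = max(range(dim), key=lambda i: highs[i] - lows[i])   # longest side
    cut = (lows[axis] + highs[axis]) / 2
    # Both sides are non-empty: the extreme points along `axis` end up on different sides.
    node.left = split_tree([p for p in points if p[axis] <= cut])
    node.right = split_tree([p for p in points if p[axis] > cut])
    return node
```

Cutting the bounding box of the points (rather than the box inherited from the parent) is what guarantees here that neither side is empty.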

Theorem 4 ([5]). Let $S$ be a set of $n$ points in $\mathbb{R}^d$, and $s > 0$ a separation constant. In
$O(n \log n)$ time, we can compute a WSPD for $S$ of size $O(n \log n)$ such that each pair
$\{A_i, B_i\}$ contains at least one singleton set.

5.1 Approximating the Bottleneck Graph 

Let $b > 0$ be a fixed real number, and let $G$ be the Euclidean graph on the point set
$S$ containing all edges of length at most $b$. In this section, we show that we can use
well-separated pairs to define a graph $H$ whose stretch factor approximates that of $G$.
In Section 5.2, we will give an algorithm that computes such a graph $H$ having only
$O(n \log n)$ edges.

Let $s > 4$ be a separation constant, and consider an arbitrary well-separated pair
decomposition $\{A_1, B_1\}, \{A_2, B_2\}, \ldots, \{A_\ell, B_\ell\}$ for the point set $S$. For any index $i$,
$1 \le i \le \ell$, let $x_i \in A_i$ and $y_i \in B_i$ be two points for which $|x_i y_i|$ is minimum.

The graph $H$ has the points of $S$ as its vertices, and contains all edges $(x_i, y_i)$ whose
length is less than or equal to $b$.
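The definition of $H$ translates directly into the brute-force sketch below. The function name and the representation of the WSPD as a list of pairs of point lists are assumptions made for illustration; the efficient computation of the closest pairs is the subject of Section 5.2.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def approximation_graph(
    pairs: Sequence[Tuple[Sequence[Point], Sequence[Point]]], b: float
) -> List[Tuple[Point, Point]]:
    """Return the edges (x_i, y_i) of H: one closest pair per WSPD pair, kept if its length is <= b."""
    edges = []
    for a_set, b_set in pairs:
        # Closest pair between A_i and B_i, found by trying all combinations.
        x, y = min(((p, q) for p in a_set for q in b_set), key=lambda e: math.dist(*e))
        if math.dist(x, y) <= b:
            edges.append((x, y))
    return edges
```

Since each WSPD pair contributes at most one edge, $H$ has at most $\ell$ edges.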

Lemma 9. Let $p$ and $q$ be any two points of $S$ such that $|pq| \le b$. Then
$|pq|_H \le (s+4)/(s-4) \cdot |pq|$.

Proof. The proof is basically the same as Callahan and Kosaraju’s proof in [6] that the 
WSPD yields a spanner for S. □ 

Lemma 10. Let $t_G^*$ and $t_H^*$ denote the exact stretch factors of the graphs $G$ and $H$,
respectively. We have $(s-4)/(s+4) \cdot t_H^* \le t_G^* \le t_H^*$.

5.2 Computing the Approximation Graph H 

We saw in the previous subsection that the graph $H$ approximates the bottleneck graph
$G$. In this section, we show how this graph $H$ can be computed if we use an appropriate
WSPD. Consider a WSPD $\{A_1, B_1\}, \{A_2, B_2\}, \ldots, \{A_\ell, B_\ell\}$ in which each pair
$\{A_i, B_i\}$ contains at least one singleton set. By Theorem 4, such a WSPD of size
$\ell = O(n \log n)$ can be computed in $O(n \log n)$ time.

The main problem is that we have to compute, for each pair $\{A_i, B_i\}$ in this WSPD,
the points $x_i \in A_i$ and $y_i \in B_i$ for which $|x_i y_i|$ is minimum. Hence, if $A_i$ is a singleton
set, i.e., $A_i = \{x_i\}$, then we have to compute a nearest neighbor $y_i$ of $x_i$ in the set $B_i$.
We will show that by traversing the split tree $T$ that gives rise to this WSPD, all these
pairs $(x_i, y_i)$, $1 \le i \le \ell$, can be computed efficiently.

Recall that for any node $u$ of the split tree $T$, we denote by $S_u$ the subset of $S$ that
is stored in the subtree of $u$. Also, each pair $\{A_i, B_i\}$ in the WSPD is defined by two
nodes $u_i$ and $v_i$ of $T$. That is, $A_i = S_{u_i}$ and $B_i = S_{v_i}$.

We store with each node $u$ of $T$ a list of all leaves $v$ such that the two nodes $u$ and
$v$ define a pair in the WSPD. (Hence, $v$ defines a singleton set in this pair.)

Let DS be a data structure that stores a set of points in $\mathbb{R}^d$, that supports
nearest-neighbor queries of the form “given a query point $q \in \mathbb{R}^d$, find a point in the set that is
nearest to $q$”, and that supports insertions of points.

The algorithm that computes the required closest pair of points in each well-separated
pair of point sets traverses the nodes of $T$ in postorder. To be more precise, let $u$ be
an internal node of $T$, and let $u'$ and $u''$ be the two children of $u$. At the moment when
node $u$ is visited, the nodes $u'$ and $u''$ store nearest-neighbor data structures DS($u'$) and
DS($u''$) storing the point sets $S_{u'}$ and $S_{u''}$, respectively. If $|S_{u'}| \le |S_{u''}|$, then we insert
all points of $S_{u'}$ into DS($u''$). Otherwise, all points of $S_{u''}$ are inserted into DS($u'$).
Hence, after these insertions, we have a nearest-neighbor data structure DS($u$) storing
the point set $S_u$. For each leaf $v$ of $T$ such that $u$ and $v$ define a pair in the WSPD, we
query DS($u$) to find a point of $S_u$ that is nearest to the point stored at leaf $v$.

During this postorder traversal of the split tree $T$, we get all pairs $(x_i, y_i)$,
$1 \le i \le \ell$. Clearly, the approximation graph $H$ can be computed from these pairs, in time
$O(\ell) = O(n \log n)$.

We analyze the running time of this algorithm. The number of nearest-neighbor
queries is equal to the number $\ell$ of pairs in the WSPD. For any internal node $u$ of $T$,
the data structure DS($u$) is obtained by inserting the points from the child's structure
whose subtree is smaller, into the structure of the other child of $u$. It is easy to prove
that in this way, each point of $S$ is inserted at most $\log n$ times. The total number of
insertions is therefore bounded by $O(n \log n)$.
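The counting argument can be checked on a toy version of the traversal. In this sketch a plain Python list stands in for the nearest-neighbor structure, a tree is either a point (a leaf) or a two-element list `[left, right]`, and `counter` is a one-element list used to accumulate the number of insertions; all of these representations are assumptions made for illustration.

```python
def merge_up(tree, counter):
    """Postorder traversal that merges the smaller child structure into the larger one.

    Returns the structure (here simply a list) holding all points below `tree`.
    """
    if not isinstance(tree, list):        # leaf: a single point
        return [tree]
    left = merge_up(tree[0], counter)
    right = merge_up(tree[1], counter)
    small, large = (left, right) if len(left) <= len(right) else (right, left)
    for p in small:                       # insert the points of the smaller set
        large.append(p)
        counter[0] += 1
    return large
```

A point is only re-inserted when the structure containing it at least doubles in size, which is why no point is inserted more than $\log_2 n$ times.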

Let $Q_{NN}(n_0)$ and $I_{NN}(n_0)$ denote the query and insertion times of the data structure
DS, respectively, if it stores a set of $n_0$ points. Since $n_0 \le n$ at any moment during
the algorithm, we have proved the following result.

Lemma 11. Let $S$ be a set of $n$ points in $\mathbb{R}^d$. After $O(n (Q_{NN}(n) + I_{NN}(n)) \log n)$
preprocessing time, we can compute the approximation graph $H$ of any bottleneck graph
$G$, in $O(n \log n)$ time.

In order to apply Lemma 11, we need to specify the data structure DS. This data
structure stores a set of points in $\mathbb{R}^d$, and supports nearest-neighbor queries and
insertions of points. We can obtain such a semi-dynamic data structure by applying
Bentley's logarithmic method, see [3,4]. This technique transforms an arbitrary static data
structure for nearest-neighbor queries into one that also supports insertions of points.
To be more specific, let $DS^s$ be a static data structure storing a set of $n$ points in
$\mathbb{R}^d$, that supports nearest-neighbor queries in $Q^s_{NN}(n)$ time, and that can be built in
$P^s_{NN}(n)$ time. The logarithmic method transforms $DS^s$ into a semi-dynamic structure
DS, in which nearest-neighbor queries can be answered in $O(Q^s_{NN}(n) \log n)$ time, and
in which points can be inserted in $O((P^s_{NN}(n)/n) \log n)$ amortized time.
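The logarithmic method itself can be sketched as follows. The static structure used here (`StaticNN`, answering queries by brute force) is only a placeholder for whatever static nearest-neighbor structure is available; the class names are introduced for this illustration.

```python
import math


class StaticNN:
    """Placeholder static structure: built once, never modified, brute-force queries."""

    def __init__(self, points):
        self.points = tuple(points)

    def nearest(self, q):
        return min(self.points, key=lambda p: math.dist(p, q))


class LogMethodNN:
    """Semi-dynamic structure: blocks[i] is None or a static structure on 2**i points."""

    def __init__(self):
        self.blocks = []

    def insert(self, p):
        carry = [p]
        i = 0
        # Merge full blocks, exactly like propagating a carry in binary addition.
        while i < len(self.blocks) and self.blocks[i] is not None:
            carry.extend(self.blocks[i].points)
            self.blocks[i] = None
            i += 1
        if i == len(self.blocks):
            self.blocks.append(None)
        self.blocks[i] = StaticNN(carry)          # one static rebuild per insertion

    def nearest(self, q):
        # Query every non-empty block and keep the best answer (needs a non-empty structure).
        candidates = [blk.nearest(q) for blk in self.blocks if blk is not None]
        return min(candidates, key=lambda p: math.dist(p, q))
```

A query visits at most $\log_2 n + 1$ blocks, and every point takes part in at most that many rebuilds, which is where the extra logarithmic factors in the bounds above come from.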

Corollary 1. Let $S$ be a set of $n$ points in $\mathbb{R}^d$, let $\beta > 1$ be an integer constant, and let $\epsilon$
be a real constant, such that $0 < \epsilon < 1/2$. After $O(n\, Q^s_{NN}(n) \log^2 n + P^s_{NN}(n) \log^2 n)$
preprocessing time, we can compute a $(c_1, c_2)$-approximate stretch factor, where
$c_1 = 2\beta(1+\epsilon)^2$ and $c_2 = 1+\epsilon$, of any bottleneck graph in log^ n) expected time.



If we combine Theorem 3 and Corollary 1, then we obtain the main result of this 
paper. 

Theorem 5. Let $S$ be a set of $n$ points in $\mathbb{R}^d$, let $\beta > 1$ be an integer constant, and let
$\epsilon$ be a real constant, such that $0 < \epsilon < 1/2$. In

O (^n « + PNNi.'n) log^ n + log^ n -b Tsei{n) log n^ 

expected time, we can compute a data structure of size $O(\log n)$, such that for any real
number $b > 0$, we can compute, in $O(\log\log n)$ time, a real number $t$, such that

    $\frac{1}{4\beta^2(1+\epsilon)^6}\, t \le t^* \le 2\beta(1+\epsilon)^5\, t$,

where $t^*$ is the exact stretch factor of the Euclidean graph containing all edges of length
at most $b$.



We conclude this section by giving concrete bounds on the preprocessing time. We
start with the case when the dimension $d$ is equal to two. The static nearest-neighbor
problem can be solved using Voronoi diagrams, and a data structure for point location
queries, see Preparata and Shamos [14]. For this data structure, we have
$Q^s_{NN}(n) = O(\log n)$, and $P^s_{NN}(n) = O(n \log n)$. Chan [8] gives a randomized distance
selection algorithm, whose expected running time $T_{sel}(n)$ is bounded by $O(n^{4/3} \log^{5/3} n)$.
Hence, if d = 2, the expected time needed to build the data structure of Theorem 5 is 
bounded by log® n + log®^® n). If /3 = 2, then the expected preprocess- 

ing time is roughly For /3 = 3, it is roughly For larger values of /3, the time 
bound remains roughly but then the approximation ratio increases. 

Assume that d > 3. Agarwal, in a personal communication to Dickerson and Epp- 
stein [10], has shown that 

T,ei{n) = ( 2 ) 

where p is an arbitrarily small positive real constant. Agarwal and Matousek [1], and 
Matousek and Schwarzkopf [12] have given a static nearest-neighbor data structure for 
which n (n) log^ n-bP/r^ (n) log^ n is asymptotically smaller than the quantity on 
the right-hand side of (2). Hence, the expected time needed to build the data structure 
of Theorem 5 is bounded from above by log® n -b This 

becomes i.e., subquadratic, if we take (3 = 2. Again, for larger 

values of $\beta$, we get the same time bound, but a larger approximation ratio.

6 Concluding Remarks

We have given a subquadratic algorithm for preprocessing a set $S$ of $n$ points in $\mathbb{R}^d$ into
a data structure of size $O(\log n)$ such that for an arbitrary query value $b > 0$, we can, in
$O(\log\log n)$ time, compute an approximate stretch factor of the bottleneck graph on $S$
containing all edges of length at most $b$. This result was obtained by (i) approximating
the sequence of $\binom{n}{2}$ different stretch factors of all possible bottleneck graphs, and (ii)
approximating bottleneck graphs by graphs containing only $O(n \log n)$ edges.

Our algorithms need exact solutions for computing minimum spanning trees, and 
nearest-neighbor queries, distance selection queries, and distance ranking queries. It 
would be interesting to know if approximation algorithms for these problems can be 
used to speed up the preprocessing time.

Abstract. Coalgebras for a functor on the category of sets subsume many formulations
of the notion of transition system, including labelled transition systems,
Kripke models, Kripke frames and many types of automata. This paper presents 
a multimodal language which is bisimulation invariant and (under a natural com- 
pleteness condition) expressive enough to characterise elements of the underlying 
state space up to bisimulation. Like Moss’ coalgebraic logic, the theory can be 
applied to an arbitrary signature functor on the category of sets. Also, an upper 
bound for the size of conjunctions and disjunctions needed to obtain characteris- 
tic formulas is given. 



1 Introduction 

Rutten [17] demonstrates that coalgebras for a functor generalise many notions of transition
systems. It was then probably Moss [13] who first realised that modal logic constitutes
a natural way to formulate bisimulation-invariant properties on the state spaces
of coalgebras. Given an arbitrary signature functor on the category of sets, the syntax
of his coalgebraic logic is obtained via an initial algebra construction, where the
application of the signature functor is used to construct formulas. This has the advantage
of being very general (few restrictions on the signature functor), but the language is
abstract in the sense that it lacks the usual modal operators $\Box$ and $\Diamond$.

Other approaches [8,9,11,15,16] devise multimodal languages, given by a set of 
modal operators and a set of atomic propositions, which are based on the syntactic anal- 
ysis of the signature functor (and therefore only work for a restricted class of transition 
signatures). 

This paper aims at combining both methods by exhibiting the underlying semantical
structures which give rise to (the interpretation of) modal operators with respect to
coalgebras for arbitrary signature functors. After a brief introduction to the general theory
of coalgebras (Section 2), we look at examples of modal logics for two different signature
functors in Section 3. The analysis of the semantical structures, which permit the use of
modalities to formulate properties on the state space of coalgebras, reveals that modal
operators arise through a special type of natural transformation, which we chose to call
“natural relation”. Abstracting away from the examples, Section 4 presents a concrete
multimodal language which arises through a set of natural relations and can be used
to formulate predicates on the state space of coalgebras for arbitrary signature functors.
We then prove in Section 5 that the interpretation of the language is indeed invariant under
(coalgebraic) bisimulation. In the last section we characterise the expressive power
of the language, and prove that under a natural completeness condition, every point of
the state space can be characterised up to bisimulation. We also give an upper bound
for the size of conjunctions and disjunctions needed to obtain characteristic formulas.

The present approach is elaborated in more detail in [14], which also contains the 
proofs of the theorems which are stated in this exposition. 



2 Transition Systems and Coalgebras 



Given an endofunctor $T : \mathrm{Set} \to \mathrm{Set}$ on the category of sets and functions, a
$T$-coalgebra is a pair $(C, \gamma)$ where $C$ is a set (the state space or carrier set of the coalgebra)
and $\gamma : C \to TC$ is a function. Using this definition, which dualises the categorical
formulation of algebras, many notions of automata and transition systems can be treated
in a uniform framework. We only sketch the fundamental definitions and refer the reader
to [7,17] for a more detailed account.

Example 1 (Labelled Transition Systems). Suppose $L$ is a set of labels. Labelled transition
systems, commonly used to formulate operational semantics of process calculi
such as CCS, arise as coalgebras for the functor $TX = \mathcal{P}(L \times X)$. Indeed, given a set
$C$ of states and a transition relation $R_l$ for each label $l \in L$, we obtain a $T$-coalgebra
$(C, \gamma)$ where $\gamma(c) = \{(l, c') \in L \times C \mid c \, R_l \, c'\}$. Conversely, every coalgebra structure
$\gamma : C \to TC$ gives rise to a family of transition relations $(R_l)_{l \in L}$ via $c \, R_l \, c'$ iff
$(l, c') \in \gamma(c)$.
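For finite systems, the two directions of this correspondence can be written down directly. In the sketch below (function names chosen here for illustration), a family of transition relations is a dictionary from labels to sets of state pairs, and a coalgebra structure is a dictionary from states to finite sets of (label, successor) pairs.

```python
def lts_to_coalgebra(states, relations):
    """gamma(c) = {(l, c') | c R_l c'} for a family of relations given as label -> set of pairs."""
    return {
        c: frozenset((l, d) for l, rel in relations.items() for (s, d) in rel if s == c)
        for c in states
    }


def coalgebra_to_lts(gamma):
    """Recover the relations: c R_l c' iff (l, c') in gamma(c)."""
    relations = {}
    for c, successors in gamma.items():
        for l, d in successors:
            relations.setdefault(l, set()).add((c, d))
    return relations
```

Going back and forth returns the original relations, except that labels without any transition do not reappear as keys in this representation.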

Many types of automata can also be viewed as coalgebras for an appropriate type of 
signature functor on the category of sets: 

Example 2 (Deterministic Automata). Let $TX = (O \times X)^I + E$ and $(C, \gamma : C \to TC)$
be a $T$-coalgebra. Given an element of the state space $c \in C$, the result $\gamma(c)$ of applying
the transition function is either an error condition $e \in E$ or a function
$f : I \to O \times C \in (O \times C)^I$. Supplying an input token $i \in I$, the result $f(i)$ of evaluating $f$ gives us an
output token $o \in O$ and a new state $c' \in C$.
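A small executable reading of this example, under the assumption that elements of $(O \times C)^I + E$ are encoded as tagged pairs: `("err", e)` for an error condition and `("fun", f)` with `f` a dictionary from input tokens to (output, next state) pairs. The function `run` and this encoding are illustrative choices, not notation from the paper.

```python
def run(gamma, state, word):
    """Feed the input word to the automaton; return the outputs and how the run ended."""
    outputs = []
    for i in word:
        tag, value = gamma[state]
        if tag == "err":                 # gamma(c) is an error condition e in E
            return outputs, ("err", value)
        o, state = value[i]              # f(i) = (output token, successor state)
        outputs.append(o)
    return outputs, ("state", state)


# A made-up parity automaton over inputs {0, 1, 2}; input 2 leads to an error state.
gamma = {
    "even": ("fun", {0: ("e", "even"), 1: ("o", "odd"), 2: ("x", "dead")}),
    "odd": ("fun", {0: ("o", "odd"), 1: ("e", "even"), 2: ("x", "dead")}),
    "dead": ("err", "overflow"),
}
```

The run stops as soon as the current state yields an error condition, mirroring the case distinction in the example.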



Morphisms of coalgebras are functions between the corresponding state spaces, which
are compatible with the respective transition structures. Dualising the categorical
formulation of algebra morphisms, a coalgebra morphism between two $T$-coalgebras
$(C, \gamma)$ and $(D, \delta)$ is a function $f : C \to D$ such that $Tf \circ \gamma = \delta \circ f$. Diagrammatically,
$f$ must make the diagram

               f
          C ------> D
          |         |
        γ |         | δ
          v         v
         TC ------> TD
              Tf

commutative. The reader may wish to convince himself that in the case of labelled
transition systems above, a coalgebra morphism is a functional bisimulation in the sense
of Milner [12]. It is an easy exercise to show that coalgebras for a functor $T$, together
with their morphisms, constitute a category.

One important feature of the functional (i.e. coalgebraic) formulation of transition
systems is that every signature functor comes with a built-in notion of bisimulation.
Following Aczel and Mendler [1], a bisimulation between two coalgebras $(C, \gamma)$ and
$(D, \delta)$ is a relation $B \subseteq C \times D$, that can be equipped with a transition structure
$\beta : B \to TB$, which is compatible with the projections $\pi_C : B \to C$ and $\pi_D : B \to D$.
More precisely, $B \subseteq C \times D$ is a bisimulation, if there exists $\beta : B \to TB$ such that

              π_C          π_D
          C <------ B ------> D
          |         |         |
        γ |       β |         | δ
          v         v         v
         TC <----- TB -----> TD
             Tπ_C       Tπ_D

commutes. Again, the reader may wish to convince himself that in the case of labelled
transition systems, coalgebraic bisimulations, as just defined, are indeed bisimulations
of labelled transition systems.
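For finite labelled transition systems both notions can be checked mechanically. The sketch below assumes the functor $TX = \mathcal{P}(L \times X)$, with coalgebra structures given as dictionaries into finite sets of (label, state) pairs. For the bisimulation check it uses the familiar back-and-forth conditions, which for this particular functor are the standard reformulation of the existence of a structure $\beta$; the function names are chosen here.

```python
def T(f, alpha):
    """Action of TX = P(L x X) on a function f (a dict): apply f to every successor."""
    return frozenset((l, f[x]) for l, x in alpha)


def is_morphism(f, gamma, delta):
    """Check T f . gamma = delta . f at every state of the source coalgebra."""
    return all(T(f, gamma[c]) == delta[f[c]] for c in gamma)


def is_bisimulation(B, gamma, delta):
    """Back-and-forth conditions for a relation B, given as a set of pairs (c, d)."""
    for c, d in B:
        forth = all(
            any(l == l2 and (c2, d2) in B for l2, d2 in delta[d]) for l, c2 in gamma[c]
        )
        back = all(
            any(l == l2 and (c2, d2) in B for l2, c2 in gamma[c]) for l, d2 in delta[d]
        )
        if not (forth and back):
            return False
    return True
```

In line with the remark on functional bisimulations, the graph of a coalgebra morphism passes the bisimulation check.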

3 Modal Logic for Coalgebras: Examples 

We exemplify the connection between modal logics and coalgebras for a functor by
means of the examples given in the previous section. In both examples we observe
that the modalities and atomic propositions of the respective languages arise via special
types of natural transformation, the “natural relations” already mentioned in the
introduction. The general theory developed in the subsequent sections is based on this
observation in that it shows that every set of natural relations induces a multimodal
language which allows one to formulate bisimulation invariant properties on the state spaces
of coalgebras for an arbitrary signature functor.

3.1 Labelled Transition Systems 

Consider the functor $TX = \mathcal{P}(L \times X)$ on the category of sets and functions. We have
demonstrated in Example 1 that $T$-coalgebras are labelled transition systems over the
set $L$ of labels. It is well known that Hennessy-Milner logic [6] (also discussed in [20])
is an expressive, bisimulation invariant language, which allows one to formulate predicates
on the state space of labelled transition systems.

Consider the set $\mathcal{L}$ of formulas built up from the atomic propositions tt, ff, conjunctions,
disjunctions and a pair of modal operators $\Box_l$ and $\Diamond_l$ for every $l \in L$. Given a
$T$-coalgebra (labelled transition system) $(C, \gamma)$ and a formula $\phi \in \mathcal{L}$, we write $[\![\phi]\!]_{(C,\gamma)}$
for the set $\{c \in C \mid (c, \gamma) \models \phi\}$ of points $c \in C$, which satisfy the formula $\phi$ with respect
to the transition structure $\gamma$, and drop the subscript $(C, \gamma)$ if the transition structure
is clear from the context. Omitting the straightforward interpretation of atomic propositions,
conjunctions and disjunctions, the interpretation of the formula $\Box_l \phi$ is given by

    $[\![\Box_l \phi]\!]_{(C,\gamma)} = \{c \in C \mid \forall c' \in C.\ (l, c') \in \gamma(c) \implies c' \in [\![\phi]\!]_{(C,\gamma)}\}$    (1)

for any $l \in L$. Note that the same definition can be used for any carrier set and transition
structure. This leads us to define, given $l \in L$, a parameterised relation $\mu_l(A) \subseteq TA \times A$,
given by

    $\alpha \; \mu_l(A) \; a$ iff $(l, a) \in \alpha$    (2)

for an arbitrary set $A$, $\alpha \in TA$ and $a \in A$. Using this definition, we can now reformulate
(1) as

    $[\![\Box_l \phi]\!]_{(C,\gamma)} = \{c \in C \mid \forall c' \in C.\ \gamma(c) \; \mu_l(C) \; c' \implies c' \in [\![\phi]\!]_{(C,\gamma)}\}$    (3)

and obtain the interpretation of the existential modality via

    $[\![\Diamond_l \phi]\!]_{(C,\gamma)} = \{c \in C \mid \exists c' \in C.\ \gamma(c) \; \mu_l(C) \; c' \wedge c' \in [\![\phi]\!]_{(C,\gamma)}\}$.    (4)
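Equations (2) to (4) can be evaluated directly on a finite coalgebra. In this sketch a natural relation is passed as a Python predicate `rel(alpha, a)`, the interpretation of the subformula is passed as a set of states `phi`, and the function names are chosen here for illustration.

```python
def mu(l):
    """The relation of equation (2) for label l: alpha mu_l(A) a  iff  (l, a) in alpha."""
    return lambda alpha, a: (l, a) in alpha


def box(rel, gamma, phi):
    """Equation (3): the states all of whose rel-successors lie in phi."""
    return {c for c in gamma if all(d in phi for d in gamma if rel(gamma[c], d))}


def diamond(rel, gamma, phi):
    """Equation (4): the states with at least one rel-successor in phi."""
    return {c for c in gamma if any(d in phi for d in gamma if rel(gamma[c], d))}
```

Nothing in `box` and `diamond` refers to labels or to the powerset functor; they only use the relation, which is the point of the reformulation.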



The fact that (2) is a canonical definition, which works for any set $A$, is witnessed by
the following universal property: For any function $f : A \to B$, the diagram of sets and
relations

                      G(Tf)
                TA ----|----> TB
                 |             |
         μ_l(A)  -             -  μ_l(B)
                 v             v
                 A ----|----> B                    (5)
                       G(f)

commutes (where we write $R : A$ ⇸ $B$ for a relation $R \subseteq A \times B$ and $G(f)$ for the graph
of a function; composition of the arrows in the diagram is relational composition).
Parameterised relations which satisfy condition (5) will be called natural relations in the
sequel. Thus summing up, one can say that natural relations give rise to the interpretation
of modalities.
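On small finite sets, condition (5) can be verified exhaustively. The sketch below works with the functor $TX = \mathcal{P}(L \times X)$ and compares the two composite relations from $TA$ to $B$ element by element; `is_natural`, `powerset` and the counterexample relation `only` are names introduced for this illustration.

```python
from itertools import chain, combinations, product


def powerset(xs):
    xs = list(xs)
    return [frozenset(s) for s in chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]


def T(f, alpha):
    """Action of TX = P(L x X) on a function f given as a dict."""
    return frozenset((l, f[a]) for l, a in alpha)


def is_natural(rel, labels, A, B, f):
    """Check diagram (5) for f : A -> B by comparing both relational composites."""
    for alpha in powerset(product(labels, A)):
        for b in B:
            first_rel_then_f = any(rel(alpha, a) and f[a] == b for a in A)
            first_Tf_then_rel = rel(T(f, alpha), b)
            if first_rel_then_f != first_Tf_then_rel:
                return False
    return True


def mu_a(alpha, a):
    return ("a", a) in alpha            # the relation of equation (2) for label "a"


def only(alpha, a):
    return alpha == frozenset({("a", a)})   # "a is the only successor": not natural
```

The second relation fails because applying $Tf$ can merge two successors into one, so the diagram does not commute for non-injective $f$.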



3.2 Input/Output Automata 

In Example 2 we have seen that deterministic input/output automata are coalgebras for
the functor $TX = (O \times X)^I + E$. We now go on to demonstrate that the modalities
needed to describe properties of these automata also arise via parameterised relations,
that is, relations which satisfy the naturality condition (5).

Given a $T$-coalgebra $(C, \gamma : C \to TC)$ and a state $c \in C$, the modality of interest
here describes the behaviour of a (possible) successor state, which arises after supplying
an input token, if the result $\gamma(c)$ of applying the transition function does not yield an
error condition $e \in E$. For $i \in I$ and an arbitrary set $A$, we consider the relation
$\mu_i(A) \subseteq TA \times A$, given by

    $\alpha \; \mu_i(A) \; a$ iff $\exists f : I \to (O \times A) \in (O \times A)^I.\ \alpha = \mathrm{inl}(f) \wedge \pi_A \circ f(i) = a$,

where $\mathrm{inl} : (O \times A)^I \to (O \times A)^I + E$ is the canonical injection and $\pi_A$ denotes the
projection function $O \times A \to A$. Note that this parameterised relation also satisfies the
naturality condition (5) and allows us to define a pair of modalities $\Box_i$ and $\Diamond_i$ using
equations (3) and (4).

In order to obtain a language which allows one to specify the behaviour of a state $c \in C$,
we furthermore need atomic propositions to be able to formulate that the application
$\gamma(c)$ of the transition function yields an error condition $e \in E$ and that, in case
$\gamma(c) \in (O \times C)^I$, supplying an input token $i \in I$ yields an output token $o \in O$. This is taken
care of by a set of atomic propositions $\{p_e \mid e \in E\} \cup \{p_{(i,o)} \mid (i,o) \in I \times O\}$. The
interpretation of the atomic propositions in this example is straightforward:

    $[\![p_e]\!]_{(C,\gamma)} = \{c \in C \mid \gamma(c) = \mathrm{inr}(e)\}$    (6)

    $[\![p_{(i,o)}]\!]_{(C,\gamma)} = \{c \in C \mid \exists f \in (O \times C)^I.\ \gamma(c) = \mathrm{inl}(f) \wedge \pi_O \circ f(i) = o\}$,    (7)

where $\mathrm{inr} : E \to (O \times C)^I + E$ is again the canonical injection and $\pi_O : O \times C \to O$
denotes the projection function. In both cases it deserves to be mentioned that the atomic
propositions arise as subsets of the set $T1$ (where we write $1 = \{*\}$ for the terminal
object in the category of sets and $!_C : C \to 1$ for the unique morphism). To be more
precise, consider the sets

    $p_e|_{T1} = \{\mathrm{inr}(e) \mid e \in E\}$    (8)

    $p_{(i,o)}|_{T1} = \{\mathrm{inl}(f) \mid f \in (O \times 1)^I \wedge \pi_O \circ f(i) = o\}$,    (9)

where in this case $\mathrm{inr} : E \to (O \times 1)^I + E$ and $\mathrm{inl} : (O \times 1)^I \to (O \times 1)^I + E$.
Using the subsets defined by (8) and (9), we now recover the interpretation of the atomic
propositions, originally given by (6) and (7), as

    $[\![p_e]\!]_{(C,\gamma)} = (T!_C \circ \gamma)^{-1}(p_e|_{T1})$

    $[\![p_{(i,o)}]\!]_{(C,\gamma)} = (T!_C \circ \gamma)^{-1}(p_{(i,o)}|_{T1})$,

respectively. Thus one can say that atomic propositions in modal logics for $T$-coalgebras
arise as subsets of the set $T1$.
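The passage through $T1$ can be made concrete for a finite automaton. The sketch below makes several assumptions of its own: elements of $(O \times X)^I + E$ are encoded as `("inl", f)` or `("inr", e)`, the point of $1$ is written `"*"`, and the set in (8) is read as the singleton containing `inr(e)` for the given `e`.

```python
from itertools import product


def t_bang(t):
    """T!_C: forget the successor states, keeping only the observable part in T1."""
    tag, value = t
    if tag == "inr":
        return t
    return ("inl", frozenset((i, (o, "*")) for i, (o, _c) in value.items()))


def p_err(e):
    """Subset of T1 for the proposition p_e (read here as the singleton {inr(e)})."""
    return {("inr", e)}


def p_out(i, o, inputs, outputs):
    """Subset of T1 for p_(i,o): all inl(f) with f : I -> O x 1 whose output on i is o."""
    result = set()
    for outs in product(outputs, repeat=len(inputs)):
        f = dict(zip(inputs, outs))
        if f[i] == o:
            result.add(("inl", frozenset((j, (f[j], "*")) for j in inputs)))
    return result


def interpret(p, gamma):
    """The inverse image of p under T!_C composed with gamma."""
    return {c for c in gamma if t_bang(gamma[c]) in p}
```

On a given automaton, `interpret` returns the same sets of states as evaluating (6) and (7) directly on the transition structure.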



4 From Natural Relation to Modal Logics 

If $T : \mathrm{Set} \to \mathrm{Set}$ is an endofunctor, the examples in the previous section suggest that
modal logics for coalgebras of a functor are induced by a set of natural relations for $T$
and a set of predicates on $T1$. The remainder of the paper is devoted to showing that
this is indeed the case. We start by exhibiting the modal language which arises from a
set of natural relations and a set of atomic propositions and show in the subsequent
sections that the language presented is bisimulation invariant and (under a completeness
condition on the set of relations) strong enough to distinguish non-bisimilar points.



4.1 Natural Relations 

Categorically speaking, natural relations are natural transformations between functors
mapping from the category Set of sets and functions to the category Rel of sets and
relations. This is captured in

Definition 1 (Natural Relations). Suppose $T$ is an endofunctor on the category of sets.
A natural relation for $T$ is a natural transformation

$\mathcal{I} \circ T \to \mathcal{I}$,

where $\mathcal{I} : \mathrm{Set} \to \mathrm{Rel}$ is the identity on sets and sends every function to the relation
given by its graph.

Unravelling the definition of natural transformations, the reader might wish to convince
himself that this definition captures the naturality requirement present in the examples.
Note that by moving from a relation $R : A ⇸ B$ to a function $s_R : A \to \mathcal{P}(B)$ given
by $s_R(a) = \{b \in B \mid a\,R\,b\}$, we can also view natural relations $\mathcal{I} \circ T \to \mathcal{I}$ as
natural transformations $T \to \mathcal{P}$ (where $\mathcal{P}$ is the covariant powerset functor). This is
essentially due to the fact that the category Rel of sets and relations appears as the
Kleisli category of the powerset monad.* Also every set $\mathcal{A}$ of subsets of $T1$ gives rise
to a natural transformation $P_{\mathcal{A}} : T \to \mathcal{P}(\mathcal{A})$, where $\mathcal{P}(\mathcal{A})$ is the constant functor which
sends every set to $\mathcal{P}(\mathcal{A})$. This is elaborated in [14].

For the remainder of this section we assume that $T : \mathrm{Set} \to \mathrm{Set}$ is an endofunctor
on the category of sets and functions, $\mathcal{M}$ is a set of natural relations for $T$, $\mathcal{A}$ is a set of
subsets of $T1$ and $\kappa$ is a cardinal number.



4.2 Syntax and Semantics of $\mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$

As is often the case with modal languages, we sometimes need infinitary constructs
in the language to obtain enough expressive power. In order to be able to deal with the
general case later, we fix a cardinal number $\kappa$, which serves as upper bound for the
size of conjunctions and disjunctions. The language $\mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$ induced by the set $\mathcal{M}$
of natural relations and $\mathcal{A}$ of atomic propositions is given by the least set of formulas
containing

- An atomic proposition $p_a$ for every $a \in \mathcal{A}$

- The formulas $\bigwedge \Phi$ and $\bigvee \Phi$, if $\Phi$ is a set of formulas of cardinality less than or equal
to $\kappa$, and

- The formulas $\Box_\mu \varphi$ and $\Diamond_\mu \varphi$ for every $\mu \in \mathcal{M}$ and every formula $\varphi$ of $\mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$.

Note that $\mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$ contains as a special case the formulas $\bigwedge \emptyset$ and $\bigvee \emptyset$, which we
shall abbreviate to tt and ff, respectively. In order to simplify the exposition of the
semantics of $\mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$, we introduce an easy bit of notation.

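Definition 2. Suppose $R \subseteq A \times B$ is a relation. Then $R$ induces two operations, which
we denote by $\Diamond_R$ and $\Box_R$, both mapping $\mathcal{P}(B) \to \mathcal{P}(A)$, given by

$\Diamond_R(\mathfrak{b} \subseteq B) = \{a \in A \mid \exists b \in B.\ a\,R\,b \wedge b \in \mathfrak{b}\}$

$\Box_R(\mathfrak{b} \subseteq B) = \{a \in A \mid \forall b \in B.\ a\,R\,b \implies b \in \mathfrak{b}\}$.

* We would like to thank one of the anonymous referees for pointing this out.

Following Moss [13], we introduce a further operator $\Delta_R : \mathcal{P}(B) \to \mathcal{P}(A)$ defined by

$\Delta_R(\mathfrak{b} \subseteq B) = \Box_R(\mathfrak{b}) \cap \bigcap_{b \in \mathfrak{b}} \Diamond_R(\{b\})$,

which we will use later.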
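
For finite sets, the three operators of Definition 2 can be computed directly. The following Python sketch is only an illustration: the function names and the encoding of a relation as a set of pairs are assumptions made here and do not come from the paper.

```python
def diamond(R, A, b):
    """Elements of A with at least one R-successor in b."""
    return {a for a in A if any((a, x) in R for x in b)}


def box(R, A, B, b):
    """Elements of A all of whose R-successors (within B) lie in b."""
    return {a for a in A if all(x in b for x in B if (a, x) in R)}


def delta(R, A, B, b):
    """box(b) intersected with diamond({x}) for every x in b."""
    result = box(R, A, B, b)
    for x in b:
        result &= diamond(R, A, {x})
    return result
```

Read this way, `delta` returns exactly those elements whose set of successors is `b`: the box part bounds the successors from above, and the intersection of diamonds bounds them from below.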

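The semantics $[\![\varphi]\!]_{(C,\gamma)}$ of a formula $\varphi \in \mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$ can now be inductively
defined:

- $[\![p_a]\!]_{(C,\gamma)} = (T!_C \circ \gamma)^{-1}(a)$ for atomic propositions $p_a$ given by $a \in \mathcal{A}$

- $[\![\bigwedge \Phi]\!]_{(C,\gamma)} = \bigcap_{\varphi \in \Phi} [\![\varphi]\!]_{(C,\gamma)}$ and $[\![\bigvee \Phi]\!]_{(C,\gamma)} = \bigcup_{\varphi \in \Phi} [\![\varphi]\!]_{(C,\gamma)}$ for conjunctions
and disjunctions (following standard conventions, we set $[\![\mathrm{tt}]\!]_{(C,\gamma)} = C$ and
$[\![\mathrm{ff}]\!]_{(C,\gamma)} = \emptyset$), and

- $[\![\Box_\mu \varphi]\!]_{(C,\gamma)} = \Box_{\mu(C) \circ G(\gamma)}([\![\varphi]\!]_{(C,\gamma)})$ and $[\![\Diamond_\mu \varphi]\!]_{(C,\gamma)} = \Diamond_{\mu(C) \circ G(\gamma)}([\![\varphi]\!]_{(C,\gamma)})$ for
the modal operators.

If the transition structure is clear from the context, we sometimes abbreviate $[\![\varphi]\!]_{(C,\gamma)}$ to
$[\![\varphi]\!]_C$ (and sometimes even to $[\![\varphi]\!]$). In case we want to emphasise that a formula holds
at a specific point $c \in C$ of the underlying set, we also write $c \models_\gamma \varphi$ for $c \in [\![\varphi]\!]_{(C,\gamma)}$.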
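
To make the inductive clauses concrete, here is a small Python evaluator for finite labelled transition systems, that is, coalgebras for $TX = \mathcal{P}(L \times X)$ with no atomic propositions. It is a sketch under assumptions: the tuple encoding of formulas, the names `sem`, `TT` and `FF`, and the reading of the natural relation for a label $l$ as "$x$ is an $l$-successor" are chosen here for illustration only.

```python
def sem(phi, gamma):
    """Set of states of the finite coalgebra `gamma` that satisfy `phi`.

    gamma maps each state to a set of (label, successor) pairs.
    Formulas are tuples: ("and", [..]), ("or", [..]),
    ("box", label, psi) and ("dia", label, psi).
    """
    states = set(gamma)
    kind = phi[0]
    if kind == "and":
        result = set(states)          # empty conjunction = all states
        for psi in phi[1]:
            result &= sem(psi, gamma)
        return result
    if kind == "or":
        result = set()                # empty disjunction = no states
        for psi in phi[1]:
            result |= sem(psi, gamma)
        return result
    if kind in ("box", "dia"):
        label, target = phi[1], sem(phi[2], gamma)

        def successors(c):
            return {d for (l, d) in gamma[c] if l == label}

        if kind == "box":
            return {c for c in states if successors(c) <= target}
        return {c for c in states if successors(c) & target}
    raise ValueError(f"unknown connective: {kind!r}")


TT = ("and", [])   # tt, the empty conjunction
FF = ("or", [])    # ff, the empty disjunction
```

For example, with `gamma = {0: {("a", 1)}, 1: set()}` the formula `("dia", "a", TT)` holds only at state `0`, while `("box", "a", FF)` holds only at the deadlocked state `1`.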

5 Invariance Properties of $\mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$

In this section, we demonstrate that $\mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$ is an adequate logic for $T$-coalgebras.
We do this by proving that the semantics of formulas is invariant under coalgebra
morphisms and that bisimilar elements of the state space of coalgebras satisfy the same set
of formulas.

For the whole section assume that $T$ is an endofunctor on Set, $\mathcal{M}$ is a set of natural
relations for $T$, $\mathcal{A}$ is a set of subsets of $T1$ and $\kappa$ is a cardinal number.

Theorem 1 (Morphisms Preserve Semantics). Suppose $f : (C, \gamma) \to (D, \delta)$ is a
morphism of coalgebras. Then

$[\![\varphi]\!]_C = f^{-1}([\![\varphi]\!]_D)$

for all formulas $\varphi$ of $\mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$.

When proving the theorem, naturality of the relations is essential. We have an easy and
immediate

Corollary 1. Suppose $f : (C, \gamma) \to (D, \delta)$ is a morphism of coalgebras and $c \in C$.
Then

$c \models_\gamma \varphi$ iff $f(c) \models_\delta \varphi$

for all formulas $\varphi \in \mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$.

We now turn to the second invariance property mentioned at the beginning of this
section and show that bisimilar points satisfy the same sets of formulas. Although this
essentially follows from Theorem 1, its importance warrants stating it as

Theorem 2 (Bisimilarity Implies Logical Equivalence). Suppose $(C, \gamma)$ and $(D, \delta)$
are $T$-coalgebras and the points $c \in C$ and $d \in D$ are related by a bisimulation. Then

$c \models_\gamma \varphi$ iff $d \models_\delta \varphi$

for all formulas $\varphi \in \mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$.

6 Expressivity

This section shows that the language $\mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$ also satisfies an abstractness
condition in the sense that, under a natural completeness condition on the pair $(\mathcal{M}, \mathcal{A})$,
non-bisimilar points of the carrier set of coalgebras can be distinguished by formulas of
$\mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$.

For the proof we assume the existence of a terminal coalgebra, that is, of a greatest
fixed point for the signature functor $T$. We represent the greatest fixed point of
the signature functor $T$ as limit of the so-called terminal sequence, which makes the
succession of state transitions explicit. The categorical dual of terminal sequences is
commonly used to construct initial algebras, see [2,19]. We use Theorem 2 of Adámek
and Koubek [3], which states that in presence of a terminal coalgebra, the latter can be
represented as a fixed point of the terminal sequence. Suppose for the remainder of this
section that $T$ is an endofunctor on the category of sets, $\mathcal{M}$ is a set of natural relations
for $T$ and $\mathcal{A}$ is a set of subsets of $T1$.

6.1 Complete Pairs 

It is obvious that we cannot in general guarantee that the language $\mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$ is strong
enough to actually distinguish non-bisimilar points, since the set $\mathcal{M}$ might not contain
enough relations or we do not have enough atomic propositions. We start by giving a
completeness criterion on the sets $\mathcal{M}$ and $\mathcal{A}$, which ensures that this does not happen.

We write $s_R(a) = \{b \in B \mid a\,R\,b\}$ if $R : A ⇸ B$ is a relation and $a \in A$. We
also denote the set of atomic propositions $a \in \mathcal{A}$ satisfied by $x \in TX$ by
$P_{\mathcal{A},X}(x) = \{a \in \mathcal{A} \mid T!_X(x) \in a\}$. We shall abbreviate $P_{\mathcal{A},X}$ to $P_{\mathcal{A}}$ (or even to $P$) in the
sequel.

Definition 3 (Completeness of $(\mathcal{M}, \mathcal{A})$). We call the pair $(\mathcal{M}, \mathcal{A})$ complete, if

$\{x\} = \bigcap_{\mu \in \mathcal{M}} \{x' \in TX \mid s_{\mu(X)}(x') = s_{\mu(X)}(x)\} \cap \bigcap_{a \in P_{\mathcal{A}}(x)} (T!)^{-1}(a)$

for all sets $X$ and all elements $x \in TX$.

Intuitively, the pair $(\mathcal{M}, \mathcal{A})$ is complete, if, given any set $X$, every element $x \in TX$ is
determined by its $\mu(X)$-successors and the atomic propositions which are satisfied by
$x$. In case of the powerset functor, this amounts to the axiom of extensionality.

A different way of understanding the completeness condition is by considering
complete pairs as natural transformations $T \to \prod_{\mu \in \mathcal{M}} \mathcal{P} \times \mathcal{P}(\mathcal{A})$. Completeness then
amounts to the fact that the induced natural transformation is “essentially injective”.
Details can be found in [14]. We briefly note that the natural relations and atomic
propositions defined in Section 3 give rise to complete pairs:

Example 3 (Complete Pairs).

1. Consider the signature functor $TX = \mathcal{P}(L \times X)$. If $\mathcal{M} = \{\mu_l \mid l \in L\}$ is the set
of natural relations defined in Section 3.1 and $\mathcal{A} = \emptyset$, then $(\mathcal{M}, \mathcal{A})$ is complete.

2. Suppose $TX = (O \times X)^I + E$ as in Section 3.2 and let $\mathcal{M} = \{\mu_i \mid i \in I\}$ and
$\mathcal{A} = \{p_e|_{T1} \mid e \in E\} \cup \{p_{(i,o)}|_{T1} \mid (i,o) \in I \times O\}$ be the set of natural relations
and atomic propositions defined there, respectively. Then $(\mathcal{M}, \mathcal{A})$ is complete.

It seems very hard to find a semantical characterisation of functors which admit a
complete pair $(\mathcal{M}, \mathcal{A})$ of natural relations and subsets of $T1$. However, it can be proved that
the class of these functors contains the identity, constant and powerset functors, and is
closed under small limits and small coproducts. For details, we refer to [14]. Note that
the class of functors admitting a complete pair is not closed under composition.



6.2 The Expressivity Theorem 

This section proves that $\mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$ is expressive enough to distinguish non-bisimilar
points, subject to the completeness of $(\mathcal{M}, \mathcal{A})$ and the size of $\kappa$. The cardinality of
conjunctions and disjunctions needed to obtain expressivity is given in terms of the
cardinality of the final coalgebra and the convergence of the terminal sequence.

Before we state the expressiveness theorem, we briefly review the construction of
greatest fixed points for set functors using terminal sequences. We only give a brief
exposition, for details see the original paper by Adámek and Koubek [3] (or Worrell
[23] for a more categorical treatment). The terminal sequence of an endofunctor $T$ on
the category of sets is an ordinal-indexed sequence $Z_\alpha$ of sets together with functions
$f_{\alpha,\beta} : Z_\alpha \to Z_\beta$ for all ordinals $\beta \leq \alpha$ such that $Z_0 = \{*\}$, $Z_{\alpha+1} = T(Z_\alpha)$ and
$Z_\lambda = \mathrm{Lim}_{\alpha < \lambda} Z_\alpha$. It can be seen as the continuation of the sequence

$1 \xleftarrow{\;!_{T1}\;} T1 \xleftarrow{\;T!_{T1}\;} T^2 1 \xleftarrow{\;T^2!_{T1}\;} T^3 1 \leftarrow \cdots$

through the class of all ordinal numbers. Note that the terminal sequence generalises
the construction of initial algebras and terminal coalgebras to functors which do not
preserve $\omega$-colimits (resp. $\omega^{op}$-limits). It has been shown in [3], Theorem 2, that in
presence of a final $T$-coalgebra, the terminal sequence converges (i.e. there exists a (limit)
ordinal $\alpha$ such that $f_{\alpha+1,\alpha}$ is an isomorphism) to the terminal coalgebra $(Z_\alpha, f_{\alpha+1,\alpha}^{-1})$.
If $f_{\alpha+1,\alpha}$ is an isomorphism, we say that the terminal sequence stabilises at $\alpha$. We are
now ready to state the expressiveness theorem:

Theorem 3 ($\mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$ Has Characteristic Formulas). Suppose $(\mathcal{M}, \mathcal{A})$ is a
complete pair, $T$ admits a terminal coalgebra $(Z, \zeta : Z \to TZ)$ and $\kappa$ is a cardinal such
that

- $\kappa \geq |s_{\mu(Z)}(z)|$ for all $z \in TZ$ and $\mu \in \mathcal{M}$

- $\kappa \geq |\mathcal{M}|$ and $\kappa \geq |\mathcal{A}|$ and

- The terminal sequence for $T$ stabilises at $\kappa$.

Then there is a formula $\varphi_z \in \mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$ such that $[\![\varphi_z]\!]_{(Z,\zeta)} = \{z\}$ for all $z \in Z$.

Given $z \in Z$, the proof defines a formula $\varphi_z(\alpha)$ for each ordinal $\alpha \leq \kappa$ and $z \in Z$ with
the property $[\![\varphi_z(\alpha)]\!]_Z = f_{\kappa,\alpha}^{-1}(\{f_{\kappa,\alpha}(z)\})$ by “induction along the terminal sequence”
$(Z_\alpha, f_{\alpha,\beta})$ for $T$. The formula $\varphi_z = \varphi_z(\kappa)$ then characterises $z$. The proof can be found
in [14].

Some remarks concerning the conditions on the cardinal $\kappa$ in Theorem 3 are in order.
Clearly, we need conjunctions and disjunctions over possibly all atomic propositions
and modalities. The third condition is also very natural, since we build the characteristic
formula step by step, until we reach the terminal coalgebra, i.e. the index where the
terminal sequence stabilises. The only unintuitive condition is the first, giving a lower
bound for $\kappa$ in terms of the final coalgebra. When looking at examples, one however
notices that the restriction on the size of successors is very often already implicit in the
signature functor $T$. One can for example show that all polynomial functors $T$ admit a
set of natural relations $\mathcal{M}$, such that for all sets $X$ and all $t \in TX$, the cardinality of the
set of successors is at most one. Also, since we require $T$ to have a terminal
coalgebra, $T$ cannot contain an unbounded powerset construction, hence the signature
determines an upper bound of the set of successors in many cases.
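
The first finite stages of a terminal sequence can be computed by hand or by machine. The sketch below does this in Python for the powerset construction, which on finite sets agrees with the finite powerset functor; the function names and the dictionary encoding of the connecting maps are assumptions made for illustration. It does not attempt the transfinite part of the construction.

```python
from itertools import combinations


def powerset(xs):
    """All subsets of the finite collection xs, as frozensets."""
    xs = list(xs)
    return {frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)}


def terminal_sequence(steps):
    """Return (stages, maps) with stages[n] = Z_n and maps[n] : Z_{n+1} -> Z_n."""
    stages = [{"*"}]                      # Z_0 = {*}
    maps = []
    for n in range(steps):
        nxt = powerset(stages[-1])        # Z_{n+1} = T(Z_n)
        if n == 0:
            # the unique map into the one-element set
            connecting = {s: "*" for s in nxt}
        else:
            # T applied to the previous connecting map: take direct images
            prev = maps[-1]
            connecting = {s: frozenset(prev[x] for x in s) for s in nxt}
        stages.append(nxt)
        maps.append(connecting)
    return stages, maps
```

Calling `terminal_sequence(3)` yields stages of size 1, 2, 4 and 16. The sizes keep growing, which illustrates why no finite stage of this particular sequence can be a fixed point.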

As a corollary we conclude that in presence of a terminal coalgebra, any two
bisimilar points satisfy the same sets of formulas. Note that for the corollary to work, we need
the signature functor $T$ to preserve weak pullbacks, since otherwise also non-bisimilar
points are identified in the terminal coalgebra. Since bisimulation fails to capture the
notion of behavioural equivalence in cases where the signature functor does not preserve
weak pullbacks, we do not consider the restriction to weak pullback preserving
functors as a defect of our theory.

In cases where the signature functor does not preserve weak pullbacks, Kurz argues
in [10] that observable equivalence is not captured by bisimulation as defined by Aczel
and Mendler [1], and - in presence of a final coalgebra - one should consider two
states bisimilar when they are identified in the final coalgebra, a notion which can be
equivalently described using co-congruences.

Corollary 2 ($\mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$ Is Adequate). Suppose $T$ preserves weak pullbacks and the
hypotheses of Theorem 3 hold. If $(C, \gamma)$ is a $T$-coalgebra and $c \in C$, there exists a formula
$\varphi_c \in \mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa)$ such that

$[\![\varphi_c]\!]_{(C,\gamma)} = \{c' \in C \mid c' \sim c\}$

(where $c \sim d$ iff there is a bisimulation $R \subseteq C \times D$ such that $c\,R\,d$).

Theorem 3 also allows us to derive a characterisation of coalgebraic bisimulation in
logical terms. To this end, we denote by $\mathrm{Th}(c) = \{\varphi \in \mathcal{L}(\mathcal{M}, \mathcal{A}, \kappa) \mid c \models_\gamma \varphi\}$ the set
of formulas satisfied by a point $c \in C$ for a $T$-coalgebra $(C, \gamma)$.

Corollary 3 (Bisimulation Is Logical Equivalence). Suppose $T$ preserves weak
pullbacks and the hypotheses of Theorem 3 hold. If $(C, \gamma)$ and $(D, \delta)$ are $T$-coalgebras and
$(c, d) \in C \times D$, then

$\mathrm{Th}(c) = \mathrm{Th}(d) \iff c \sim d$

(where again $c \sim d$ iff there is a bisimulation $R \subseteq C \times D$ with $c\,R\,d$).

7 Conclusions and Related Work

We have exhibited two semantical principles which allow us to use multimodal logics to
specify bisimulation invariant properties of coalgebras for an arbitrary signature functor
$T$. The same issue has been addressed in [4,11,13,15]. We briefly compare the results
presented in this paper to the contributions just mentioned.

Regarding the work of Moss [13], it has already been pointed out that the
construction of the language used to formulate properties on state spaces of coalgebras is very
general, and imposes few restrictions on the signature functor $T$. Since the construction
of the language is carried out in the category of classes and set-continuous functions, $T$
has to be set-based (i.e. the action of $T$ on classes has to be defined by its action on sets).
In order to obtain a characterisation result, the signature functor $T$ is also assumed to
be uniform, a condition which also appears (in slightly different form) in [21,22]. Note
that the defining property of uniformity (taken from [21], section 5.5) is the existence
of a natural transformation $\rho : \bar{T} \to \mathcal{P} \circ W$, where $\bar{T}$ is the extension of $T$ to the
category of classes, $\mathcal{P}$ is the powerset functor and $W$ maps a class $C$ to the carrier of the
$\bar{T}$-algebra free over $C$. Hence $T$ can be embedded into a powerset construction, but
in general this does not seem to imply that $T$ can be embedded into a product $\prod_{\alpha < \kappa} \mathcal{P}$
of the powerset functor for a fixed cardinal $\kappa$. It remains an open question whether, in
presence of an accessibility condition on $T$, such an embedding can be obtained, which
would also lead to a better semantical characterisation of the class of functors which
admit complete pairs.

We turn to the work of Baltag [4], where a logical characterisation of simulation
is given by extending a set functor $T$ to a relator, that is, to an endofunctor
$\mathrm{Rel}(T) : \mathrm{Rel} \to \mathrm{Rel}$ on the category of sets and relations. Baltag argues that different extensions
of $T$ to a relator give rise to different notions of simulation, including bisimulation,
which is captured by extending $T$ to a strong relator. The logical language used to
obtain a characterisation of (various notions of) simulation is similar to that used in
[13]. One of the main goals of the present paper was to obtain languages which (only)
characterise bisimulation. In case the signature functor $T$ preserves weak pullbacks,
it is shown in [5] (which is also used in [18], giving - to the author's knowledge -
the first characterisation of bisimulation in terms of relators) that $T$ can be uniquely
extended to a strong relator $\mathrm{Rel}(T)$. In this case, natural relations can be equivalently
described as natural transformations $\mathrm{Rel}(T) \circ \mathcal{I} \to \mathcal{I}$, where $\mathcal{I} : \mathrm{Set} \to \mathrm{Rel}$ is the
canonical embedding. While this reformulation does not seem to simplify our treatment
of coalgebraic modal logic, it would be interesting to see whether, on replacing the strong
relator $\mathrm{Rel}(T)$ by a different extension of $T$ to a relator, the languages constructed in
this paper also give rise to a characterisation of the different forms of simulation as
discussed in [4].

The work of [9,11,15] focuses on an inductively defined class of functors, and the
languages considered there are built by induction on the structure of the signature
functor. We have shown in [14] that most of the functors considered in these approaches
admit a complete pair. The notable exception are functors which contain more than
one “occurrence” of the powerset functor $\mathcal{P}$, for example $TX = \mathcal{P}(A \times \mathcal{P}(B))$.
The logic described in [15] admits a characterisation result even for those functors,
but at the expense of a language constructed by an iteration of inductive definitions.
That is, at every “occurrence” of the powerset functor, one has to close the language
constructed so far under propositional connectives and modalities and use the set
thus obtained as the base case for a new inductive definition. This technique could
be mimicked in the framework of natural relations by considering a chain of relations
$T = T_k ⇸ T_{k-1} ⇸ \cdots ⇸ T_0 = \mathrm{Id}$, where each set of relations $T_j ⇸ T_{j-1}$ enjoys
a completeness property. Looking at examples, the approach seems promising, but we
have not yet worked out the details which then would lead to a more general theory.

Finally, we would like to comment on the predicate liftings used in [9]. By an easy
inductive argument, one can see that the “paths to identity” used in loc. cit. in order
to obtain modal operators give rise to natural relations $T ⇸ \mathrm{Id}$. On the other hand,
every natural relation $\mu$ determines a pair of predicate liftings $\exists_\mu$ and $\forall_\mu$. Here we use
the term “predicate lifting” in the general sense, indicating a natural transformation
$2 \to 2 \circ T$ ($2$ denotes the contravariant powerset functor) in contrast to [9], where
one associates a fixed predicate lifting to each functor $T$ by induction on its syntactical
structure. It should also be noted that from a logical perspective, the interpretation of
the modal operator associated to the predicate liftings $\exists_\mu$ and $\forall_\mu$ coincides with the
interpretation of the existential and universal modality $\Diamond_\mu$ and $\Box_\mu$ induced by a natural
relation $\mu : T ⇸ \mathrm{Id}$. It thus seems that predicate liftings also give rise to logics for
coalgebras, but expressiveness results are probably more difficult to obtain, since one
cannot argue in terms of successors any more (as we did in the proof of Theorem 3).

Abstract. We show that the language of pictures over {a, b} (with a
reasonable relation between height and width), where the number of a’s
is equal to the number of b’s, is recognizable using a finite tiling system.
This means that counting in rectangular arrays is definable in existential
monadic second-order logic.

Classification: Automata and formal languages, logic. 



1 Introduction 

In [GRST94] pictures are defined as two-dimensional rectangular arrays of sym- 
bols of a given alphabet. A set (language) of pictures is called recognizable if it 
is recognized by a finite tiling system. It was shown in [GRST94] that a picture 
language is recognizable iff it is definable in existential monadic second-order 
logic. In [Wil97], it was shown that star-free picture expressions are strictly 
weaker than first-order logic. The context-sensitive languages are characterized 
in [LS97a] as frontiers of picture languages. In the same spirit, a link to compu- 
tational complexity is established in [Bor99], where NP is characterized with the 
notion of recognizability by padding 1-dimensional words with blanks to form 
an n-dimensional cube. 

A comparison to other regular and context-free formalisms to describe picture
languages can be found in [Mat97,Mat98]. Characterizations of the recognizable
picture languages by automata can be found in [IN77] and [GR96], where also
the subclasses, which are defined by a restriction from nondeterminism to
determinism or unambiguity, are considered.

In the one-dimensional case, counting is a kind of a prototype concept for 
non-recognizability (= non-regularity), but spending one extra dimension easily 
enables counting (see Section 2) for one line. But it was so far conjectured 
that counting cannot be done for 2 dimensions without having an extra third 
dimension available. In [Rei98], the author could only find a nonuniform method 
for simulating a counter along a picture, which just showed why the attempts, 
which had been made to disprove Theorem 5, had failed. 

1.1 Preliminaries 

Definition 1. [GRST94] A picture over $\Sigma$ is a two-dimensional array of
elements of $\Sigma$. The set of pictures of size $(m,n)$ is denoted by $\Sigma^{m,n}$. A picture
language is a subset of $\Sigma^{*,*} := \bigcup_{m,n \geq 0} \Sigma^{m,n}$.
For a $p \in \Sigma^{m,n}$, we define $\hat{p} \in (\Sigma \cup \{\#\})^{m+2,n+2}$ by
adding a frame of symbols $\# \notin \Sigma$.
Let $T_{m,n}(p)$ be the set of all sub-pictures of $p$
with size $(m,n)$.
A picture language $L \subseteq \Gamma^{*,*}$ is called local if there is a $\Delta$ with
$L = \{p \in \Gamma^{*,*} \mid T_{2,2}(\hat{p}) \subseteq \Delta\}$. A picture language $L \subseteq \Gamma^{*,*}$ is called hv-local if there is a $\Delta$
with $L = \{p \in \Gamma^{*,*} \mid T_{1,2}(\hat{p}) \cup T_{2,1}(\hat{p}) \subseteq \Delta\}$. A picture language $L \subseteq \Sigma^{*,*}$ is called
recognizable if there is a mapping $\pi : \Gamma \to \Sigma$ and a local language $L' \subseteq \Gamma^{*,*}$
with $L = \pi(L')$.



This means that in order to recognize a picture, we have to find
(non-deterministically) a pre-image in the local pre-image language. According to [LS97b],
$L$ is recognizable if and only if there is a mapping $\pi : \Gamma \to \Sigma$ and a hv-local
language $L' \subseteq \Gamma^{*,*}$ with $L = \pi(L')$. This means we can use a hv-local pre-image
language, as well.

A necessary condition for recognizability, reflecting that, at most, an expo- 
nential amount of information can get from one half of the picture to another, 
is the following: 



Lemma 2. [Mat98] Let $L \subseteq \Gamma^{*,*}$ be recognizable and $(M_n \subseteq \Gamma^{n,*} \times \Gamma^{n,*})$ be
sets of pairs with $\forall n, \forall (l,r) \in M_n:\ lr \in L$ and
$\forall (l,r) \neq (l',r') \in M_n:\ lr' \notin L$ or $l'r \notin L$.
Then $|M_n| \in 2^{O(n)}$.

Let $L$ be the set of pictures over $\{a, b\}$, where the number of a’s is equal to
the number of b’s. Considering pictures where the width is $f(n) \notin 2^{O(n)}$ for the
height $n$, we can find $f(n)$ pairs $(l_1, r_1), (l_2, r_2), \ldots, (l_{f(n)}, r_{f(n)})$ such that $l_i$ has
$i$ more a’s than b’s and $r_i$ has $i$ more b’s than a’s for all $i \leq f(n)$. Thus $l_i r_i$ is in $L$
but all the $l_i r_j$ with $i \neq j$ have a different number of a’s and b’s and are, thus,
not in $L$. By contradiction of $f(n) \notin 2^{O(n)}$ with Lemma 2, we get the following:

Corollary 3. The language of pictures over $\{a, b\}$, where the number of a’s is
equal to the number of b’s (and where sizes $(n,m)$ might occur, which do not
follow the restriction $m \leq f(n)$ or $n \leq f(m)$ for a function $f \in 2^{O(n)}$), is not
recognizable.
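
The pairing argument can be checked mechanically on small instances. The Python sketch below uses a slight variation chosen for simplicity (an assumption of this illustration, not the indexing used above): the left picture of the k-th pair contains exactly k a’s and the right picture exactly k b’s, so that a left and a right part combine to a balanced picture precisely when their indices agree. All function names are hypothetical.

```python
def picture(height, width, count, symbol, other):
    """Picture with `count` cells holding `symbol` and the rest holding `other`."""
    cells = [symbol] * count + [other] * (height * width - count)
    return ["".join(cells[r * width:(r + 1) * width]) for r in range(height)]


def concat(left, right):
    """Horizontal concatenation of two pictures of equal height."""
    return [l + r for l, r in zip(left, right)]


def balanced(pic):
    """True if the picture has as many a's as b's."""
    text = "".join(pic)
    return text.count("a") == text.count("b")


def fooling_pairs(height, width):
    """Pairs (l_k, r_k): l_k has k a's, r_k has k b's, for every possible k."""
    return [
        (picture(height, width, k, "a", "b"), picture(height, width, k, "b", "a"))
        for k in range(height * width + 1)
    ]
```

For a fixed height, the number of such pairs grows with the width, which is the quantity that Lemma 2 bounds for recognizable languages.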




To formulate the main result (Theorem 5) of the paper, we need the following 
definition: 



Definition 4. The picture language $L_{=}$ (resp. $L^{c}_{=}$) is the set of pictures over
$\{a, b\}$ (resp. $\{a, b, c\}$), where the number of a’s is equal to the number of b’s and
having a size $(n,m)$, with $m \leq 2^n$ and $n \leq 2^m$.

Remark: We could as well use any other constant base $k$ instead of 2, which
means that there is no gap to Corollary 3.

Theorem 5. The languages $L_{=}$ and $L^{c}_{=}$ are recognizable.



1.2 Overview 

The next section will show how easy it is if we only have to count the difference
of a’s and b’s in the bottom line. The problem which arises in the general case
is that the counter is not able to accept simple increments at any local position.
Section 3 reduces the problem to that of counting only a’s and b’s in odd columns
and rows.

The essential idea of the proof of Theorem 5, in order to overcome this
problem, is to construct some ’counting flow’, which has small constant capacity at
each connection leading from one position to its neighbor. The connections have
different orders, which are powers of 4. For example, a flow of 7 in a connection
of order $4^i$ represents a total flow of $7 \cdot 4^i$. Similar to the counter used in [Für82],
the number of occurrences of a connection with order $4^i$ is exponentially
decreasing with $i$. Since the order cannot be known on the local level (the alphabet is
finite but not the order), some ’skeleton structure’ must describe a hierarchy of
orders which gives the information at which positions some counted value can be
transferred from one order to the next. If, for example, a row having the order $4^i$
crosses a column having the order $4^{i+1}$, the flow in the row may be decreased by
4 while simultaneously increasing the flow in the column by 1, which preserves the
total flow. This skeleton-language is described in Section 4. Section 5 describes the
counting flow for squares whose side length is a power of 2 using a variation of the Hilbert-curve.
Section 6 shows the generalization to exponentially many appended squares by
combining the techniques of Sections 2 and 4 and the generalization to odd cases
by folding.
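
The bookkeeping behind the transfer between orders is plain arithmetic: four units at order $4^i$ weigh as much as one unit at order $4^{i+1}$. A minimal Python sketch, with a dictionary from exponents to flow values as an assumed encoding and hypothetical function names:

```python
def total_flow(flows):
    """flows maps an exponent i to the flow carried by connections of order 4**i."""
    return sum(value * 4 ** i for i, value in flows.items())


def transfer(flows, i):
    """Decrease the flow at order 4**i by 4 and increase it at order 4**(i+1) by 1."""
    updated = dict(flows)
    updated[i] = updated.get(i, 0) - 4
    updated[i + 1] = updated.get(i + 1, 0) + 1
    return updated
```

Starting from a flow of 7 at order $4^2$, one transfer leaves a flow of 3 at order $4^2$ and 1 at order $4^3$, and `total_flow` is the same before and after.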

2 Simple Counting for One Line 

This section can be viewed as an exercise for Section 5. Here, we consider the
language of pictures over $\Sigma = \{a, b, c\}$ with an equal number of a’s and b’s in
the rightmost column and the rest filled with c’s (see, for example, $\pi(p)$ below).

We use a local pre-image language describing a flow. It is defined over
the alphabet $\Gamma = \{-1, 0, 1\}^4$, where the numbers in $(l, r, u, d)$ describe the flow
going out of the position to the left, to the right, up and down. In the graphical
representation, we describe this by arrows. The sources of the flow correspond
to the a’s, which is described by $\pi((1, 0, 0, 0)) = a$. Analogously, the
b’s are the sinks: $\pi((-1, 0, 0, 0)) = b$. Everywhere else, the flow has to
be continued, which is expressed by $\pi((l, r, u, d)) = c$ if $2l + r + u + d = 0$. The
main point is that the flow to the left side has the double order. This means
that flows from the rightmost column to the second rightmost column and flows
within the second rightmost column have order 1 and, in general, flows between
the $i$-th rightmost column and the $(i+1)$-th rightmost column and flows within
the $(i+1)$-th rightmost column have the order $2^{i-1}$. For example, we may use the
pre-image

[Figure: an example pre-image $p$ drawn with arrows, together with its image $\pi(p)$.]

Although there might be several possible pre-images, one of them can be
obtained in the following way: For a and b, there is only one possible pre-image.
The pre-image $(l, r, u, d)$ for c is chosen by taking $r := -l'$ for the right neighbor
$(l', \cdot, \cdot, \cdot)$ and $u := -d'$ for the upper neighbor $(\cdot, \cdot, \cdot, d')$ ($u := 0$ if the upper neighbor
is #). If $r + u = 2$ (resp. $-2$), let $l := -1$ (resp. 1) and $d := 0$. If $r + u = 1$ (resp.
$-1$), let $d := -1$ (resp. 1) and $l := 0$; else $d := l := 0$.

In this way the flow from one row down to the next row corresponds to the
binary representation of the difference in the number of a’s and b’s so far. A flow
to the left corresponds to a carry-bit.
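
The construction of this particular pre-image is deterministic and can be traced in a few lines of Python. The sketch below is an illustration under stated assumptions: pictures are lists of equal-length strings over `a`, `b`, `c`; the flow entering a cell from a neighbor is read as the negative of the flow the neighbor sends out in that direction; a cell without a right or upper neighbor receives 0 from that side; and acceptance is taken to mean that no flow leaves the picture at the left or at the bottom. The names `preimage` and `accepted` are hypothetical.

```python
def preimage(picture):
    """Assign a tuple (l, r, u, d) of outgoing flows to every cell."""
    rows, cols = len(picture), len(picture[0])
    flow = [[None] * cols for _ in range(rows)]
    for y in range(rows):                      # top to bottom
        for x in range(cols - 1, -1, -1):      # right to left
            symbol = picture[y][x]
            if symbol == "a":                  # source
                flow[y][x] = (1, 0, 0, 0)
            elif symbol == "b":                # sink
                flow[y][x] = (-1, 0, 0, 0)
            else:
                r = -flow[y][x + 1][0] if x + 1 < cols else 0
                u = -flow[y - 1][x][3] if y > 0 else 0
                s = r + u
                if abs(s) == 2:
                    l, d = -s // 2, 0          # carry to the left, double order
                elif abs(s) == 1:
                    l, d = 0, -s               # pass the unit downwards
                else:
                    l, d = 0, 0
                flow[y][x] = (l, r, u, d)
    return flow


def accepted(picture):
    """True if nothing flows out at the left edge or at the bottom edge."""
    flow = preimage(picture)
    no_left_overflow = all(row[0][0] == 0 for row in flow)
    bottom_is_zero = all(cell[3] == 0 for cell in flow[-1])
    return no_left_overflow and bottom_is_zero
```

On the picture with rows `cca`, `cca`, `ccb`, `ccb` the two a’s produce a carry into the leftmost column, which the two b’s cancel again, so the picture is accepted; with rows `ca`, `ca` the carry would have to leave the picture on the left and the picture is rejected.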

The formal definition for $\Delta$ is

[Display: sets of tiles drawn as pictures, with the side conditions $\gamma = (l, r, 0, d)$,
$\gamma = (l, r, u, 0)$, $\gamma = (0, r, u, d)$, and $\gamma = (l, 0, 0, 0)$ with $l \in \{1, -1\}$.]

Remark: We could as well use $\Gamma = \{1 - k, \ldots, k - 1\}^4$ and $\pi((l, r, u, d)) = c$ for
$k \cdot l + r + u + d = 0$ to be able to treat pictures of size $(n, m)$ with $m < k^n$.



3 Reduction to Odd Positions 



We view a mapping $e : \Sigma \to \Sigma_e^{i,j}$ as lifted to map a picture of size $(m, n)$ over
$\Sigma$ to a picture of size $(im, jn)$ over $\Sigma_e$.

Lemma 6. A picture language $L$ over $\Sigma$ is recognizable if $e(L)$ is recognizable
and $e$ is injective.

Proof. Let $e(L)$ be recognizable by a mapping $\pi_e : \Gamma_e \to \Sigma_e$ and a tiling $\Delta_e$.
Construct $\Gamma := \{g \in \Gamma_e^{i,j} \mid \exists s \in \Sigma\ e(s) = \pi_e(g)\}$, $\pi : \Gamma \to \Sigma$ with
$\pi(g) = s$ for $e(s) = \pi_e(g)$ ($e$ injective) and
$\Delta := \{p \in (\Gamma_e \cup \{\#\})^{2i,2j} \mid p \in (\Gamma \cup \{\#\})^{2,2},\ T_{2,2}(p) \subseteq \Delta_e \cup \{\begin{smallmatrix} \# & \# \\ \# & \# \end{smallmatrix}\}\}$
where, for simplicity, we identify $\#$ with
the picture of size $(i,j)$ consisting only of $\#$.

We use $e : \{a, b, c\} \to \{a, b, c, d\}^{2,2}$ with $e(x) = \begin{smallmatrix} x & d \\ d & d \end{smallmatrix}$ for $x \in \{a, b, c\}$.
In order to show that $L^{c}_{=}$ and, thus, $L_{=} = L^{c}_{=} \cap \{a, b\}^{*,*}$ is recognizable, it
remains to show in Lemma 10 that $e(L^{c}_{=})$ is recognizable, which means that
for $L_F$ in Lemma 8 we only have to care about a’s and b’s on positions with an
odd row and column number by intersecting $L_F$ with the recognizable language
$\{\begin{smallmatrix} x & d \\ d & d \end{smallmatrix} \mid x \in \{a, b, c\}\}^{*,*}$.

4 The Skeleton for Squares of the Power of 2 



The skeleton describes a square, where each corner is surrounded by a square
of half the size. The skeleton is described as a hv-local language $L_S$ over
the alphabet $\Sigma_S$ of line-drawing symbols [symbol list not recoverable].
Since the last section we are particularly interested
in the intersection $L_S \cap L_R := \{p_1, p_2, \ldots\}$ with the recognizable language
$L_R$ [definition not recoverable]. The first 4 examples are

[Figure: the pictures $p_1$, $p_2$, $p_3$ and $p_4$, each surrounded by a frame of #.]

First we will show by induction that for every $i$, a picture $p_i$ of size $(2^i, 2^i)$
is in $L_S \cap L_R$.



Consider as the base of induction and p 4 as an example for a step of 
induction. Except from the right and lower edge where the picture meets the 
#’s of the frame, As has all the symmetries of a square. The upper left quarter 
of pi+i is exactly Pi. Furthermore, the 3 sub-pictures of size (2* — 1, 2*— 1), starting 
withpi+i(l,2*-|-l) = r, Pi-i-i(2*-l- 1, 1) = r and pi+i(2* -I- 1, 2* -I- 1) = r, are 
rotations of the sub-picture of size ( 2 * — 1 , 2 * — 1 ), starting with Pi(l, 1 ) = r 
around pi+i(2*,2*) = f - Now, As allows us to continue the 2*-th row after f 
with — ’s until the ^ at the column with the only 'I at the lower edge of the 
upper right sub-picture. i,From now on, continue with -^’s until the at the 
last column, which had .’s so far and continues with :’s until the t" in the lower 
right corner. (Column 2* and last row analogously). 

The opposite direction, that for every $i$, exactly one picture $p_i$ of size $(2^i, 2^i)$
is in $L_S \cap L_R$, follows (considering the only possibility for the right and lower
edge) from the following lemma:

Lemma 7. For every picture $p \in L_S \cap L_R$ and for every $i > 1$, $r, c \geq 0$, the
sub-picture $q$ of size $(2^i - 1, 2^i - 1)$ starting at $p(c2^i + 1, r2^i + 1)$ has the following
shape at the outside: The upper row is periodically filled by q{t, 1) = r (resp. 

n y') if t Mod A = 1 (resp. 2,3,0) and t < 2*. The same holds in a 90°- 
rotation symmetric manner. Two exceptions to this are that for even (resp. odd) 
r we have (7(2*“^, 2* — 1) = (resp. g(2*“^, 1) = /fv j and for even (resp. odd) 
c we have <7(2* — 1, 2*“^) = ^ (resp. <7(1, 2*“^) = -^) if i > 2 and that for even 
(resp. odd) r we have (7(2*“^, 2® — 1) = ^ (resp. q(2*“^,l) = and for even 
(resp. odd,) c we have (7(2* — 1,2*“^) = ^ (resp. q{l,2^~^) = if i = 2. 

A Java program at http://www-fs.informatik.uni-tuebingen.de/~reinhard/picgen.html
demonstrates that in most cases the symbol is determined by its left and upper
neighbors; a wrong choice in the remaining cases will sooner or later result in
an unsolvable case at another position. The program is interactive and may help
the reader to understand the following proof.

Proof. For each induction step we need the following additional Claim C:

All left neighbors of $q(1, j)$ with $1 \le j < 2^i$ are in { I , 'I , /f' } with
the only exception that the left neighbor of $q(1, 2^{i-1})$ is in {A-, vfr, A/-, "f , }
if $c$ is odd. Analogously, all upper neighbors of $q(j, 1)$ with $1 \le j < 2^i$
are in {#, -, ••, ─, } with the only exception that the upper
neighbor of $q(2^{i-1}, 1)$ is in { <|; , if , '| } if $r$ is odd.

Base of induction for $i = 2$: Assume furthermore, by induction on $r$ and $c$, that
Claim C holds (the neighbors are # for $c = 0$ resp. $r = 0$). Because of $L_{┌}$ we
have $q(1, 1) \in$ {-] , ┌, }. Since $q(1, 1) =$ -| would, by $\Delta_s$, require the left
neighbor to be in {-V-, 7(r}, and since $q(1, 1) \in$ { } would, by $\Delta_s$, require
the upper neighbor to be in { ^ , ^}, we have $q(1, 1) =$ ┌, which is the only
remaining possibility for a position with both odd coordinates. Consequently,
considering the upper neighbor, $q(2, 1) =$ -V- (resp. = ffr) if $r$ is even (resp. odd)
and $q(1, 2) =$ ^ (resp. = ) if $c$ is even (resp. odd). Each of the 4 combinations
forces $q(2, 2) =$ -t, T", ^ or -^. The two ends of $q(2, 2)$ point to the exceptions
of the outside shape. Furthermore, $q(3, 1) =$ -| , $q(1, 3) =$ . Thus we get one of the 4
combinations of $q(2, 3) =$ Ar (resp. = ^) if $r$ is odd (resp. even) and $q(3, 2) =$ -f
(resp. = ^) if $c$ is odd (resp. even) and $q(3, 3) =$ .

Right neighbors of $q(3, 1) =$ q , $q(3, 2) =$ and $q(3, 3) =$ must be in
{ I , 'I , /| }, which proves Claim C for $c + 1$ if $c$ is odd. If $c$ is even, the right
neighbor of $q(3, 2) =$ ^ is in {Ar, G(r, AG, }, which proves Claim C for
$c + 1$. The same holds for $r + 1$ analogously.

Step from $i$ to $i + 1$: Assume furthermore, by induction on $r$ and $c$, that Claim
C holds for $i + 1$. (The neighbors are # for $c = 0$ resp. $r = 0$.) By induction
on $i$ we have that each of the 4 sub $i$-pictures of the $(i + 1)$-sub-picture $q$ has its
exceptional side hidden inside $q$. Since $q(1, 2^i - 1) =$ , considering the possible
left neighbors leads to $q(1, 2^i) =$ • if $i + 1$ is even, resp. $q(1, 2^i) \in$
if $i + 1$ is odd. The periodical contents of the rows $2^i - 1$ and $2^i + 1$ only
allow us to continue row $2^i$ with the same symbol until column $2^{i-1}$, where
$q(2^{i-1}, 2^i) \in$ { ^ , ^ , ^}. This allows only the combination $q(1, 2^i) =
\dots = q(2^{i-1} - 1, 2^i) =$ • and $q(2^{i-1}, 2^i) =$ ^ if $i + 1$ is even, resp., $q(1, 2^i) =
\dots = q(2^{i-1} - 1, 2^i) =$ ^ and $q(2^{i-1}, 2^i) =$ ^ if $i + 1$ is odd, which has to be
continued with $q(2^{i-1} + 1, 2^i) = \dots = q(2^i - 1, 2^i) =$ ••, resp., ─. Depending on
the analogous column $2^i$, we get $q(2^i, 2^i) \in$ { t”, ^}, resp., $q(2^i, 2^i) \in$ {-t , "^ }
and, further, have to continue with $q(2^i + 1, 2^i) = \dots = q(2^i + 2^{i-1} - 1, 2^i) =$ ─,
resp., -, $q(2^i + 2^{i-1}, 2^i) =$ resp., and with $q(2^i + 2^{i-1} + 1, 2^i) = \dots =
q(2^{i+1} - 1, 2^i) =$ ^ if $i + 1$ is even, resp., • if $i + 1$ is odd. Together with the
analogous column $2^i$, this completes the description of $q$.

The right neighbor of $q(2^{i+1} - 1, 2^i) =$ ^ (resp. •) must be in
{ "f , } if $c$ is even, resp. { | , , /jv } if $c$ is odd, which proves Claim C for
$c + 1$. The same holds for $r + 1$ analogously.



5 The Counting Flow for Squares of the Power of 2 



Lemma 8. $e(L_=) \cap \bigcup_i \Sigma^{2^i, 2^i}$ is recognizable.

Proof. We define a language $L_F$ in the following and show
$e(L_=) \cap \bigcup_i \Sigma^{2^i, 2^i} = L_F \cap e(\{a, b, c\}^{*,*})$.
We give each flow from one cell to its neighbor a capacity
from -9 to 9 by defining $\Sigma_F := \Sigma_s \times \{-9, -8, \dots, 9\}^4$. Furthermore, we allow
only those symbols $(x, l, r, u, d) \in \Sigma_F$ fulfilling:

$\pi(x, l, r, u, d) := a$ if $x \in$ {┐, ┌, └, ┘} $\wedge\ l + r + u + d = 1$,

$\pi(x, l, r, u, d) := b$ if $x \in$ {┐, ┌, └, ┘} $\wedge\ l + r + u + d = -1$,

$\pi(x, l, r, u, d) := c$ if $x \in$ {┐, ┌, └, ┘} $\wedge\ l + r + u + d = 0$,

$\pi(x, l, r, u, d) := d$ if $x \in$ {-f , t”, ^, "^} $\wedge\ l + r + u + d = 0$,
   or $x \in$ { | , t , /f , ─, } $\wedge\ l = -r \wedge u = -d$,
   or $x \in$ {AG, Gr, A)r} $\wedge\ l + r + 4u + 4d = 0$,
   or $x \in$ { ^ , ^ } $\wedge\ 4l + 4r + u + d = 0$.

The tiling

$$\Delta_F := \left\{ \begin{matrix} f \\ g \end{matrix} \;\middle|\; \begin{matrix} f_1 \\ g_1 \end{matrix} \in \Delta_s,\ f = (f_1, l, r, u, d),\ g = (g_1, l', r', -d, d') \right\} \cup \left\{ f\,g \;\middle|\; f_1\,g_1 \in \Delta_s,\ f = (f_1, l, r, u, d),\ g = (g_1, -r, r', u', d') \right\}$$



takes care of the continuation of the flow (additionally, we have the tiles with 
Here the tile f g is depicted as

      u           u'
   l  f  r    -r  g  r'
      d           d'

illustrating that the flow $r$ going out of $f$ in the right direction is the same as
the flow going into $g$ from the left. The symbols ┐, ┌, └, ┘ allow sources and
sinks of the flow; they only occur in odd rows and columns and, therefore, have
the order 1; -T , f , occur where the flow in the column and the row have the
same order; , 1, i , I occur where the flow in the column and the row have
a completely different order; and AG, Gr, A[^, Afr, occur where a
rectangle or its elongation meets a rectangle of half the size, which means that
a carry of the counter can take place here. Examples are:

[Figure: examples of such tiles]

In general, for any $j$ and $i$, the $(2j + 1) \cdot 2^i$-th row, resp. column, has the order
$4^i$ (if it is in the picture). The symbols in {┐, ┌, …}
occur where a $(2^{i-1} + j \cdot 2^i)$-th row crosses a $(2^{i-1} + k \cdot 2^i)$-th column having the
same order. The symbols in {^, Ar, 7|e} occur where a $(2^{i-1} + j \cdot 2^i)$-th row
crosses a $(2^i + k \cdot 2^{i+1})$-th column having the fourfold order ( ^ vice
versa). Thus, from the existence of a flow it follows that the number of sources and
sinks (a's and b's) must be equal. A picture in $e(\{a, b, c\}^{*,*})$ has its pre-images
in $L_{┌}$ and, thus, Lemma 7 makes sure that the projection to the first component
in the 5-tuples has the correct structure, which means that
$L_F \cap e(\{a, b, c\}^{*,*}) \subseteq e(L_=) \cap \bigcup_i \Sigma^{2^i, 2^i}$.
For the other direction, we have to show that for any picture of size $(2^i, 2^i)$
with an equal number of sources and sinks (w.l.o.g. on positions on odd rows
and columns), we can construct at least one corresponding flow:

Here we use the Hilbert curve, where each corner point
$(2^{i-1} + j_x \cdot 2^i,\ 2^{i-1} + j_y \cdot 2^i)$ of the curve having order $4^i$ is surrounded by
4 corner points $(2^{i-1} + j_x \cdot 2^i \pm 2^{i-2},\ 2^{i-1} + j_y \cdot 2^i \pm 2^{i-2})$ of the curve
having order $4^{i-1}$ (see also [NRS97]).




A curve of each order uses 3 lines of a square and then one elongation line to 
get to the next square. In this way at least one of the 3 lines crosses the curve 
of the next higher order. (If it crosses it a second time, we ignore the second 
crossing in the following consideration.) Now we construct the flow along the 
curve according to the following rules: If a flow of more than 3 (resp. less than 
-3) crosses a flow of the next higher order or if a flow of more than 0 (resp. 
less than 0) crosses a negative (resp. positive) flow of the next higher order, it is 
decreased (resp. increased) by 4 and the flow of the next higher order is increased 
(resp. decreased) by one. We may assume by induction that a curve has a flow
in $[-6, 6]$ as it enters a square. After at most 3 crossings of the curve of the
next lower order (which could bring the flow, for example, to -9 or 9), it will cross
the curve of the next higher order, bringing the flow to $[-5, 5]$. Since we have to
consider 4 crossings in the square, the first condition of the rule makes sure that
the curve also leaves the square with a flow between -6 and 6 and, thus, never
exceeds its capacity. The second condition of the rule makes sure that a small
total flow will find itself represented in curves with low order, which is important
towards the end of the picture.

Example:

b b b b
a a a a
a a a b

In the sub-picture $\pi(p)$ displayed above (last line of d's omitted) the difference
of a's and b's is 2. Assume, for example, that the difference of a's and b's above
$\pi(p)$ is $16 = 4 \cdot 3 + 1 \cdot 4$, which is represented by a flow of 4 with order 1 and
a flow of 3 with order 4 entering the pre-image $p$ from above at columns 1 and 2.
Then the total difference of $18 = 16 + 2$ is represented by a flow of 1 with order 16
and a flow of 2 with order 1 leaving $p$ to the right side.
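
The carry rule and the example can be replayed numerically. The following is only a minimal sketch under simplifying assumptions: it ignores the geometry of the Hilbert curve and keeps one flow per order in a list, with `flows[k]` standing for the flow of order 4**k. The names `cross`, `value` and `carry_upwards` are hypothetical helpers, not notation from the text.

```python
def cross(low, high):
    """Apply the rule once, when a flow `low` crosses a flow `high`
    of the next higher order. Returns the updated pair (low, high)."""
    if low > 3 or (low > 0 and high < 0):
        return low - 4, high + 1   # carry 4 units of the lower order upwards
    if low < -3 or (low < 0 and high > 0):
        return low + 4, high - 1   # borrow 4 units of the lower order
    return low, high


def value(flows):
    """Number represented by flows[k] units of order 4**k."""
    return sum(f * 4 ** k for k, f in enumerate(flows))


def carry_upwards(flows):
    """Let each order cross the next higher one once, lowest order first."""
    flows = list(flows)
    for k in range(len(flows) - 1):
        flows[k], flows[k + 1] = cross(flows[k], flows[k + 1])
    return flows


# The example: a flow of 4 with order 1 and a flow of 3 with order 4 enter (16),
# the sub-picture contributes a surplus of 2 at order 1.
entering = [4, 3, 0]
with_surplus = [entering[0] + 2] + entering[1:]
leaving = carry_upwards(with_surplus)
```

Each application of `cross` leaves the represented number unchanged, since 4 units of one order equal 1 unit of the next; in this sketch `leaving` holds a flow of 2 with order 1 and a flow of 1 with order 16.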





6 The Generalization 

Lemma 9. e(Ll) fl IJ ■ ■ is recognizable. 

“ i zt i :J_ 

Proof. Adding the tiles and ^ to $\Delta_s$ allows skeleton-pictures
of size $(j2^i, 2^i)$ for any $j > 0$.

The counting flow described in the previous section can be continued from
one square to the next, as the following picture illustrates:

[Figure: the counting flow continued from one square to the next]

But what can we do with a flow (of order $4^i$) reaching the bottom line?
The idea is to combine the method with the method of Section 2 by using
$\Sigma_E := \Sigma_F \times \{-1, 0, 1\}^4$ as a new alphabet and designing $\Delta_E$ in such a way that a
transfer of the flow from one system to the other is allowed at those symbols f
and "t , which occur at the bottom line. The $r$-th row (from the bottom) can
now have flows of the order $4^i \cdot 2^{r-1}$. This allows us to represent numbers up to
$4^i \cdot 2^{2^i}$ and, thus, to recognize pictures of size $(j2^i, 2^i)$ for any $0 < j < 2^{2^i}$
with the same number of a's and b's.

Lemma 10. $e(L_=)$ is recognizable.

Proof. For the general case of pictures of size $(m, n)$, where we assume w.l.o.g.
$m \ge n$, we choose the smallest $i$ with $2^i \ge n$ and the smallest $j$ with $j2^i \ge m$.
Then, since $2n > 2^i$ and $2m > j2^i$, a picture of size $(j2^i, 2^i)$, which was
recognized with the method described so far, can be folded twice and we get the
size $(m, n)$. This folding can be simulated using an alphabet $\Sigma_G := \Sigma_E^4$, where
the first layer corresponds to the picture and the other 3 layers may contain the
simulated border consisting of #'s and parts of the flow but no sinks and sources
(this means only c's and d's). $\Delta_G$ simulates $\Delta_E$ on each layer by additionally
connecting layer 1 with 4 and 2 with 3 at the top border and 1 with 2 and 3
with 4 at the right border.
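
The choice of $i$ and $j$ is easy to check mechanically. The sketch below assumes the reading "smallest $i$ with $2^i \ge n$, smallest $j$ with $j2^i \ge m$" and positive sizes; `folded_frame` is a hypothetical helper name.

```python
def folded_frame(m, n):
    """Return (i, j) such that a picture of size (j * 2**i, 2**i)
    can be folded twice onto a picture of size (m, n)."""
    if m < n:                 # w.l.o.g. m >= n
        m, n = n, m
    i = 0
    while 2 ** i < n:         # smallest i with 2**i >= n
        i += 1
    side = 2 ** i
    j = -(-m // side)         # smallest j with j * side >= m (ceiling division)
    return i, j


i, j = folded_frame(5, 3)     # frame of size (2 * 4, 4) for a 5 x 3 picture
```

The two inequalities used in the proof, $2n > 2^i$ and $2m > j2^i$, say that the frame is less than twice as large as the picture in each direction, so a single fold per direction suffices.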






Remark: In the same way, we can further generalize this counting method to
n-dimensional recognizable "picture" languages by folding into n dimensions
as in [Bor99].

7 Outlook 

It remains open to find a deterministic counting method on pictures. Obviously,
this cannot be done using the deterministic version of on-line tessellation
acceptors [IN77] as a model, since the automaton cannot handle the number occurring
in the last line. But a good candidate is the notion of deterministic
recognizability in [Rei98]. At least in the case of squares of the power of 2, a
construction of the skeleton along the Hilbert curve should be possible, but
working out the details will be hard.

Hierarchies of Regular Star-Free Languages*

Abstract. We propose a new, logical approach to the decidability problem
for the Straubing and Brzozowski hierarchies based on the preservation
theorems from model theory, on a theorem of Higman, and on
the Rabin tree theorem. In this way, we get purely logical, short proofs
for some known facts on decidability, which might be of methodological
interest.

Our approach is also applicable to some other similar situations, say to
"words" over dense orderings, which is relevant to continuous time
and hybrid systems.

Keywords: Star-free regular languages, hierarchies, definability,
decidability.



1 Introduction 

In automata theory, several natural hierarchies of regular languages have been studied.
Among the most popular are the hierarchies of Brzozowski and Straubing [Pin86],
both exhausting the regular star-free languages. A natural question about these
hierarchies is formulated as follows: given a level of a hierarchy and a finite
automaton, one has to decide effectively whether or not the language of the
automaton is in the given level. So far, this question has been answered positively
only for the lower levels. For higher levels, the problem is still open and seems to
be hard (see e.g. [Pin86, Pin94] for more information and references).

In the literature one can identify at least two approaches to the decidability
problem, which might be called algebraic and automata-theoretic. The first
approach exploits the well known relationship of regular languages to semigroups;
the second one tries to find a property of a finite automaton (usually in terms
of so-called forbidden patterns) equivalent to the property that the language
recognized by the automaton is in the given level.

In this paper, we propose another, logical, approach to the problem. From
[Th82, PP86] it follows that the problem is similar in formulation to some
traditional decidability problems of logic. Our main observation is that one can
apply in this situation some old facts known as preservation theorems (see e.g.
[Ro63, Ma71]), as well as a theorem of Higman [CKa91]. Observing that the
corresponding conditions are interpretable in the Rabin tree theory, we get new,
purely logical and short proofs of some known facts on decidability. This might
be of methodological interest. Our approach is applicable also to some other
similar situations, yielding several new results.

* Partly supported by the Alexander von Humboldt Foundation, by a grant of the
Russian Ministry of Education and by RFBR Grant 00-01-00810.

A. Ferreira and H. Reichel (Eds.): STACS 2001, LNCS 2010, pp. 539–550, 2001.
© Springer-Verlag Berlin Heidelberg 2001

The rest of our paper is organized as follows: in Section 2 we consider some 
versions of the Straubing hierarchy, in Section 3 some versions of the Brzozowski 
hierarchy, in Section 4 we discuss the role of the empty word and relationships 
of our versions to the original Straubing and Brzozowski hierarchies, in Section 
5 we discuss some relevant results and possible future work. 

We close this introduction by recalling the notation used throughout the paper.
Let $A$ be an alphabet, i.e. a finite nonempty set. Let $A^*$ (resp. $A^+$) denote the
set of all words (resp., of all nonempty words) over $A$. As usual, the empty
word is denoted by $\varepsilon$, the length of a word $u$ by $|u|$, and the concatenation of
words $u$ and $v$ by $uv$. Concatenation of languages $X$, $Y$ is denoted $XY$. For
$u = u_0 \cdots u_n \in A^+$ and $i \le j \le n$, let $u[i, j]$ denote the segment (or factor) of $u$
bounded by $i, j$ (including the bounds).

2 Straubing-Type Hierarchies

A word $u = u_0 \dots u_n \in A^+$ may be considered as a structure
$\mathbf{u} = (\{0, \dots, n\}; <, Q_a, \dots)$, where $<$ has its usual meaning and $Q_a$ ($a \in A$) are
unary predicates on $\{0, \dots, n\}$ defined by $Q_a(i) \leftrightarrow u_i = a$. As is well known
(see e.g. [MP71, Th82, PP86]), there is a close relationship between star-free
languages and classes of models $\mathbf{u}$ of sentences of signature $\sigma = \{<, Q_a, \dots\}$ (in
this section, "sentence" means "first order formula of signature $\sigma$ without free
variables"; the only exception is the proof of Theorem 2.1 below where we need
also sentences of another kind). Note that the alphabet $A$ is fixed throughout
the paper, hence we omit the alphabet from our notation.

Let us consider a first order theory CLO of signature $\sigma$ that is closely related
to the theory of regular languages. The axioms of CLO state that $<$ is a linear
ordering and that any element satisfies exactly one of the predicates $Q_a$ ($a \in A$).
Models of CLO are called colored (more precisely, $A$-colored) linear orderings.
We use letters like $\mathbf{u}, \mathbf{v}, \dots$ (respectively, $\mathbf{U}, \mathbf{V}, \dots$) to denote finite (respectively,
countable) models of CLO. As usual, $\mathbf{U} \subseteq \mathbf{V}$ denotes that $\mathbf{U}$ is a substructure
of $\mathbf{V}$. For a sentence $\phi$, let $M_\phi$ be the set of all countable models of CLO
satisfying $\phi$, in symbols $M_\phi = \{\mathbf{U} \mid \mathbf{U} \models \phi\}$. Note that any finite model of
CLO is isomorphic to a structure $\mathbf{u}$ from the preceding paragraph, for a unique
$u \in A^+$. The relation $\subseteq$ induces a partial ordering on $A^+$ that will be denoted
by the same symbol.
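
On finite words, the induced ordering is the scattered-subword relation: $u \subseteq v$ holds when $u$ can be obtained from $v$ by deleting letters. A minimal sketch, assuming words are represented as Python strings; `word_structure` and `is_subword` are hypothetical helper names.

```python
def word_structure(u):
    """Return the unary predicates Q_a of the structure of the word u,
    each given as the set of positions carrying the letter a."""
    predicates = {}
    for position, letter in enumerate(u):
        predicates.setdefault(letter, set()).add(position)
    return predicates


def is_subword(u, v):
    """True if u embeds into v preserving order and letters,
    i.e. the structure of u is (isomorphic to) a substructure of that of v."""
    remaining = iter(v)
    # `letter in remaining` advances the iterator up to the first match,
    # so successive letters of u are matched strictly from left to right.
    return all(letter in remaining for letter in u)
```

For instance, `is_subword("ac", "abc")` holds, while `is_subword("ca", "abc")` does not, because the order of positions must be preserved.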

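For $n > 0$, let $\Sigma_n^0$ denote the set of all sentences in prenex normal form
starting with the existential quantifier and having $n - 1$ quantifier alternations.
Let $S_n$ be the set of sentences equivalent to a $\Sigma_n^0$-sentence (modulo theory CLO).
In other words, $S_n = \{\phi \mid \exists \psi \in \Sigma_n^0\, (M_\phi = M_\psi)\}$. Let $\check{S}_n$ be the dual set for $S_n$,
i.e. the set of sentences equivalent to negations of $S_n$-sentences. Let $B(S_n)$ be
the set of sentences equivalent to a Boolean combination of $\Sigma_n^0$-sentences. Then
we have the following assertions.

Lemma 2.1. (i) For any $n > 0$, $B(S_n) \subseteq S_{n+1} \cap \check{S}_{n+1}$.

(ii) $\phi \in S_1$ iff $\forall \mathbf{U} \models \phi\ \forall \mathbf{V} \supseteq \mathbf{U}\ (\mathbf{V} \models \phi)$.

(iii) $\phi \in \check{S}_2$ iff the union of an arbitrary chain $\mathbf{U}_0 \subseteq \mathbf{U}_1 \subseteq \cdots$ of models of $\phi$ is
a model of $\phi$.

(iv) $\phi \in S_2$ iff $\forall \mathbf{U} \models \phi\ \exists \mathbf{u} \subseteq \mathbf{U}\ \forall \mathbf{V}\, (\mathbf{u} \subseteq \mathbf{V} \subseteq \mathbf{U} \to \mathbf{V} \models \phi)$.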

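Proof. (i)-(iii) are well known results of logic (see e.g. the "preservation
theorems" from [Ro63, Ma71, Sh67, CKe73]), while (iv) easily follows from (iii).
Namely, if a sentence $\phi = \exists \bar{x} \forall \bar{y}\, \psi(\bar{x}, \bar{y})$, where $\psi$ is a quantifier-free formula, is
true in $\mathbf{U}$, then let $\mathbf{u}$ be the substructure of $\mathbf{U}$ with the universe $\{x_1, \dots, x_n\}$,
where $\bar{x} = (x_1, \dots, x_n)$. Then $\mathbf{u}$ clearly satisfies the condition
$\forall \mathbf{V}\, (\mathbf{u} \subseteq \mathbf{V} \subseteq \mathbf{U} \to \mathbf{V} \models \phi)$. Conversely, assume the right-hand side condition
of (iv) and prove that $\phi \in S_2$. Suppose the contrary; then, by (iii), there is a chain
$\mathbf{U}_0 \subseteq \mathbf{U}_1 \subseteq \cdots$ of models of $\neg\phi$ the union $\mathbf{U}$ of which satisfies $\phi$. Let $\mathbf{u}$ be a
finite substructure of $\mathbf{U}$ satisfying $\forall \mathbf{V}\, (\mathbf{u} \subseteq \mathbf{V} \subseteq \mathbf{U} \to \mathbf{V} \models \phi)$. Choosing a
number $i$ with $\mathbf{u} \subseteq \mathbf{U}_i$, one gets a contradiction (take $\mathbf{U}_i$ in place of $\mathbf{V}$). This
completes the proof.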

Let $\{D_k\}_{k \ge 0}$ be the difference hierarchy (known also as the Boolean
hierarchy) over $S_1$. Hence, $D_0$ is the set of false sentences, $D_1 = S_1$, $D_2$ ($D_3$, $D_4$) is
the set of sentences equivalent to sentences of the form $\phi_0 \wedge \neg\phi_1$ (respectively,
$(\phi_0 \wedge \neg\phi_1) \vee \phi_2$, $(\phi_0 \wedge \neg\phi_1) \vee (\phi_2 \wedge \neg\phi_3)$) and so on, where $\phi_n \in S_1$ (for more
information on the difference hierarchy see e.g. [Ad65, Se95]). An alternating chain
for a sentence $\phi$ is by definition a sequence $\mathbf{U}_0 \subseteq \cdots \subseteq \mathbf{U}_k$ of CLO-models
such that $\mathbf{U}_i \models \phi$ iff $\mathbf{U}_{i+1} \models \neg\phi$; $k$ is called the length of such a chain. Such a
chain is called a 1-alternating chain if $\mathbf{U}_0 \models \phi$. One could consider also infinite
alternating chains (with order type $\omega$).
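
For finite words the definition can be tested directly. The sketch below is an illustration only: it restricts attention to finite models (words as strings, $\subseteq$ as the scattered-subword relation) and models a sentence by an arbitrary Python predicate `phi`, which is an assumption of the sketch rather than a first order sentence.

```python
def is_subword(u, v):
    """Scattered-subword test: u arises from v by deleting letters."""
    remaining = iter(v)
    return all(letter in remaining for letter in u)


def is_alternating_chain(chain, phi, one_alternating=False):
    """Check that chain[0] <= chain[1] <= ... in the subword ordering and
    that consecutive members disagree on phi. With one_alternating=True,
    additionally require that the first member satisfies phi.
    The length of the chain is len(chain) - 1."""
    for u, v in zip(chain, chain[1:]):
        if not is_subword(u, v) or bool(phi(u)) == bool(phi(v)):
            return False
    if one_alternating:
        return bool(chain) and bool(phi(chain[0]))
    return True


def exactly_one_a(w):
    return w.count("a") == 1


def even_length(w):
    return len(w) % 2 == 0
```

With `exactly_one_a`, the chain `["a", "aa"]` is 1-alternating of length 1 and cannot be prolonged, since every extension of `"aa"` has at least two a's. With `even_length`, chains such as `["aa", "aaa", "aaaa"]` can be prolonged indefinitely.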

The next assertions are also known in a more general form [Ad65, Se91].

Lemma 2.2. (i) For any $k$, $D_k \cup \check{D}_k \subseteq D_{k+1}$.

(ii) $\bigcup_k D_k = B(S_1)$.

(iii) $\phi \in D_k$ iff there is no 1-alternating chain for $\phi$ of length $k$.

We are ready to prove one of our main results on the decidability of some
classes of sentences introduced above.

Theorem 2.1. The classes of sentences $S_1$, $S_2$, $B(S_1)$, $D_k$ ($k \ge 0$) are decidable.

Proof. Let $T = \{0, 1\}^*$ and let $r_0, r_1$ be unary functions on $T$ defined by $r_i(u) = ui$
($i \le 1$). According to the celebrated theorem of M. Rabin [Ra69], the monadic
second order theory S2S of the structure $(T; r_0, r_1)$ is decidable. We shall use
this fact in the following way: for any set $C \in \{S_1, S_2, B(S_1), D_k \mid k \ge 0\}$ and for
any $\sigma$-sentence $\phi$ one can effectively construct a monadic second order sentence
$\tilde{\phi}$ of signature $\{r_0, r_1\}$ such that $\phi \in C$ iff $\tilde{\phi} \in$ S2S (the monadic sentence $\tilde{\phi}$ is
called the interpretation of the sentence $\phi$). This is obviously enough.

We will use some well known facts on definability (by monadic second order
formulas) in $(T; r_0, r_1)$ established in [Ra69]. First recall that the lexicographical
ordering $\preceq$ on $T$ is definable. Let $B \subseteq T$ be the set of all sequences $x101$ having
no subsequence $101$ except one at the end. Then $B$ is definable and $(B; \preceq)$ has
the order type of the rationals. This implies that any countable linear ordering is
isomorphic to an ordering of the form $(U; \preceq)$ with $U \subseteq B$. Hence, any countable
model of CLO is isomorphic to a structure of the form $\mathbf{U} = (U; \preceq, Q_a, \dots)$
with $U \subseteq B$ and $Q_a \subseteq U$ for $a \in A$ (in this proof, we call such structures
inner structures). In the monadic logic, one can use variables for subsets of
$T$ and even quantify over them. Hence, it is possible to speak about arbitrary
inner structures. We can also speak about substructures because for any abstract
models $\mathbf{U}$ and $\mathbf{V}$ of CLO, $\mathbf{U}$ is embeddable in $\mathbf{V}$ iff there are inner models
$(U; \preceq, Q_a, \dots)$ and $(V; \preceq, Q'_a, \dots)$ isomorphic to $\mathbf{U}$ and $\mathbf{V}$, respectively, and satisfying
$U \subseteq V$ and $Q_a \subseteq Q'_a$ ($a \in A$).

Note also that for any fixed $\sigma$-sentence $\psi$ the set of all inner structures $\mathbf{U}$
satisfying $\psi$ is definable (e.g., if $\psi$ is $\forall x \exists y\, (x < y \wedge Q_a(y))$ then $\mathbf{U} \models \psi$ iff
$\forall x \in U\, \exists y \in U\, (x \prec y \wedge Q_a(y))$). In particular, the set of all inner models of CLO
is definable.

Now let us return to the proof of the theorem. Let e.g. $C = S_1$ and let $\phi$ be
a given $\sigma$-sentence. Let $\tilde{\phi}$ be a monadic sentence expressing that for any inner
model $\mathbf{U}$ of CLO satisfying $\phi$ and any inner model $\mathbf{V}$ of CLO extending $\mathbf{U}$, $\mathbf{V}$
satisfies $\phi$. By Lemma 2.1 and the remarks above, $\phi \in S_1$ iff $\tilde{\phi} \in$ S2S. This completes
the proof for the set $S_1$. The remaining cases are treated in the same way (in the
case of $S_2$ one should note that the class of finite subsets of $T$ is also definable
[Ra69]). This completes the proof.

Remark 2.1. The proof implies the known fact that the monadic second order 
theory of countable models of CLO is decidable. 

Theorem 2.1 demonstrates ideas of our approach for a decision problem tradi- 
tional for logic (though the results seem formally new). It turns out that, due to 
its abstract nature, the approach is also applicable in the context of automata 
theory, which we would like now to demonstrate. This application is founded 
on a close relationship of star-free regular languages to first order definability 
established in [MP71]. 

By the remarks at the beginning of this section, there is a natural one-one
correspondence between subsets of $A^+$ and classes of finite CLO-models closed
under isomorphism. This induces some notions on words corresponding to
notions on models introduced above; we will use some of these notions under the
same names. Relate to any sentence $\phi$ the language $L_\phi^+ = \{u \in A^+ \mid \mathbf{u} \models \phi\}$.
By [MP71], such languages are exactly the regular star-free languages. Let
$S_n^+$, $B(S_n^+)$, $D_k^+$ be defined as the corresponding classes above, but with $L^+$ in
place of $M$; in particular, $S_n^+ = \{\phi \mid \exists \psi \in \Sigma_n^0\, (L_\phi^+ = L_\psi^+)\}$. Then $\{B(S_n^+)\}_{n \ge 1}$ is
the version of the Straubing hierarchy mentioned in the introduction.







Note that there is an evident relationship between the +-classes and the
corresponding classes without +, namely $S_n \subseteq S_n^+$ and so on. But the +-classes

contain a lot of new sentences. E.g., we have % B{Si) (the sentence saying 

that the ordering is dense belongs to but not to S' 2 ). 

Recall [CKa96, Theorem 7.2] that a well partial ordering is a partial ordering 
such that for any nonempty subset X the set of all minimal elements of X is 
nonempty and finite. 

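Lemma 2.3. (i) $(A^+; \subseteq)$ is a well partial ordering.

(ii) $\phi \in D_k^+$ iff there is no 1-alternating chain of words for $\phi$ of length $k$.

(iii) $\phi \in B(S_1^+)$ iff there is no infinite alternating chain of words for $\phi$.

(iv) $\phi \in B(S_1^+)$ iff $\forall \mathbf{U}\, \exists \mathbf{u} \subseteq \mathbf{U}\, (\forall \mathbf{v}\, (\mathbf{u} \subseteq \mathbf{v} \subseteq \mathbf{U} \to \mathbf{v} \models \phi) \vee \forall \mathbf{v}\, (\mathbf{u} \subseteq \mathbf{v} \subseteq \mathbf{U} \to
\mathbf{v} \models \neg\phi))$.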

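Proof. (i) is a well known result of G. Higman (see e.g. [CKa96, Theorem 7.2]).

(ii) From left to right, the assertion follows from Lemma 2.2.(iii). Now let
there be no 1-alternating chain of words for $\phi$ of length $k$; we have to show $\phi \in D_k^+$.
For simplicity of notation, consider only the typical particular case $k = 2$; then there
are no words $u_0, u_1, u_2 \in A^+$ with $u_0 \subseteq u_1 \subseteq u_2$ and $\mathbf{u}_0 \models \phi$, $\mathbf{u}_1 \models \neg\phi$, $\mathbf{u}_2 \models \phi$.
Let $C_0 = \{u \in A^+ \mid \exists u_0 \in A^+\, (u_0 \subseteq u \wedge \mathbf{u}_0 \models \phi)\}$ and
$C_1 = \{u \in A^+ \mid \exists u_0, u_1 \in A^+\, (u_0 \subseteq u_1 \subseteq u \wedge \mathbf{u}_0 \models \phi \wedge \mathbf{u}_1 \models \neg\phi)\}$.
One easily checks that $L_\phi^+ = C_0 \setminus C_1$. By
(i), any of $C_0, C_1$ is either empty or of the form $\{v \in A^+ \mid v_0 \subseteq v \vee \dots \vee v_m \subseteq v\}$
for some $m \ge 0$ and $v_0, \dots, v_m \in A^+$. This easily implies that $C_i = L_{\phi_i}^+$ for some
$\phi_i \in \Sigma_1^0$ ($i \le 1$). Then $L_\phi^+ = L_{\phi_0 \wedge \neg\phi_1}^+$. Hence, $\phi \in D_2^+$, completing the proof.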
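
The decomposition $L_\phi^+ = C_0 \setminus C_1$ can be observed on a small example. The sketch below is a brute-force illustration on a bounded sample of words, not the construction of the proof: it assumes the alphabet $\{a, b\}$, takes for $\phi$ the property "exactly one a" (modelled by a Python predicate), and only inspects words of length at most 4, so the computed minimal elements are trustworthy only because the true generators happen to be that short.

```python
from itertools import product


def is_subword(u, v):
    """Scattered-subword test: u arises from v by deleting letters."""
    remaining = iter(v)
    return all(letter in remaining for letter in u)


def words_up_to(alphabet, max_length):
    """All nonempty words over `alphabet` of length at most `max_length`."""
    return ["".join(t) for k in range(1, max_length + 1)
            for t in product(alphabet, repeat=k)]


def minimal_elements(words):
    """Minimal elements of a finite set of words in the subword ordering."""
    words = set(words)
    return sorted(w for w in words
                  if not any(x != w and is_subword(x, w) for x in words))


def in_upward_closure(generators, v):
    """True if some generator is a subword of v."""
    return any(is_subword(g, v) for g in generators)


def phi(w):
    return w.count("a") == 1


sample = words_up_to("ab", 4)
c0 = [w for w in sample
      if any(is_subword(u0, w) and phi(u0) for u0 in sample)]
c1 = [w for w in sample
      if any(is_subword(u0, u1) and is_subword(u1, w) and phi(u0) and not phi(u1)
             for u0 in sample for u1 in sample)]
gen0 = minimal_elements(c0)   # finite basis of C_0 within the sample
gen1 = minimal_elements(c1)   # finite basis of C_1 within the sample
```

Here $C_0$ is generated by the single word a and $C_1$ by aa, so the language of $\phi$ is "contains a, but does not contain aa as a subword", a difference of two upward closed sets.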

(iii) From left to right, the assertion follows from (ii) and the equality
$B(S_1^+) = \bigcup_k D_k^+$. It remains to show that for any $\phi \notin B(S_1^+)$ there is an infinite
alternating chain of words. By (ii), there are alternating chains of words for $\phi$ of
arbitrary finite length.

Let $\omega^*$ be the set of all finite sequences of natural numbers, including the
empty sequence $\varepsilon$. We construct a partial function $u : \omega^* \to A^*$ as follows.
Let $u(\varepsilon) = \varepsilon$ and suppose, by induction on $|\tau|$, that $u(\tau)$ is already defined. If
$|\tau|$ is even then find $m \in \omega$ and words $v_0, \dots, v_m \in A^+$ enumerating without
repetitions the $\subseteq$-minimal elements in $X = \{v \in A^+ \mid u(\tau) \subseteq v \wedge \mathbf{v} \models \phi\}$. Then
we set $u(\tau i) = v_i$ for $i \le m$, and $u(\tau i)$ is undefined for $i > m$. For $|\tau|$ odd, the
definition is similar, but we use the set $X = \{v \in A^+ \mid u(\tau) \subseteq v \wedge \mathbf{v} \models \neg\phi\}$.

From (i) and (ii) it easily follows that $\{\tau \in \omega^* \mid u(\tau) \text{ is defined}\}$ is an
infinite finitely branching tree (under the relation of being an initial segment). By
König's lemma, there is an infinite path through this tree. The image of this
path under $u$ provides the desired infinite alternating chain for $\phi$.

(iv) Let $\phi \in B(S_1^+)$, then $L_\phi^+ = L_\psi^+$ for a Boolean combination $\psi$ of
$\Sigma_1^0$-sentences. Note that $\psi, \neg\psi \in S_2$ and any $\mathbf{U}$ satisfies one of $\psi, \neg\psi$. Hence, the
condition on the right-hand side of (iv) follows from Lemma 2.1.(iii).

Conversely, suppose that $\phi \notin B(S_1^+)$. By (i), there is an infinite alternating
chain $\mathbf{u}_0 \subseteq \mathbf{u}_1 \subseteq \dots$ for $\phi$ consisting of finite models of CLO. Then $\mathbf{U} = \bigcup_k \mathbf{u}_k$ is
a countable model of CLO for which the condition on the right-hand side of (iv)
is false. This completes the proof.

Repeating the proof of Theorem 2.1, one immediately gets

Theorem 2.2. The classes $S_1^+$, $B(S_1^+)$, $D_k^+$ ($k \ge 0$) are decidable.

Remark 2.2. So far, we have been unable to prove (by purely logical means) the
known fact that the class $S_2^+$ is decidable.

Note that Lemma 2.3 and Theorem 2.2 provide new, shorter proofs for several
known facts from automata theory (cf. e.g. [St85, Pin86, SW99]). E.g., decidability
of $B(S_1^+)$ is equivalent (using a simple observation of Section 4 below) to the
well-known result on decidability of the class of so-called piecewise testable
languages.

Our method is also applicable to some other similar situations, and now we
want to give a couple of examples. There are several natural modifications of the
operation $\phi \mapsto L_\phi^+$; among the most popular are $\omega$-languages
$L_\phi^\omega = \{\alpha : \omega \to A \mid \boldsymbol{\alpha} \models \phi\}$ and $Z$-languages ($Z$ is the set of integers)
$L_\phi^Z = \{\alpha : Z \to A \mid \boldsymbol{\alpha} \models \phi\}$,
where $\boldsymbol{\alpha}$ is the structure defined similarly to the case of finite words (one could
even consider ’’words” over more exotic linear orderings, say rationals or w^). 
Such operations induce corresponding classes of sentences $S_n^\omega$, $B(S_n^\omega)$, and so
on. Are such classes of sentences decidable?

So far, we have been unable to answer this question using the methods developed
above (the problem is that we do not see an appropriate analog of Lemma 2.3
for the infinite words). But the methods become applicable if we add to infinite
words the finite ones, i.e. if we consider "languages" like $L_\phi^{+\omega} = L_\phi^+ \cup L_\phi^\omega$, which
are also traditional objects of automata theory, and the corresponding classes
of sentences $S_n^{+\omega}$, .... Let us formulate the analog of Theorem 2.2 for $\omega$-words
(similar results hold also for other kinds of infinite words).

Theorem 2.3. The classes $S_1^{+\omega}$, $B(S_1^{+\omega})$, $D_k^{+\omega}$ ($k \ge 0$) are decidable.

Proofsketch. From Lemma 2.1.(iii) it follows that if $\phi \in S_2$ then $L_\phi^{+\omega}$ is
approximable (i.e., for any $\omega$-word $\alpha \in L_\phi^{+\omega}$ there is a finite word $u \subseteq \alpha$ such
that $\mathbf{v} \models \phi$ for any finite word $v$ with $u \subseteq v \subseteq \alpha$). Repeating the proof of
Lemma 2.3, one obtains analogs of assertions (ii), (iii) and (iv) for the classes
$B(S_1^{+\omega})$, $D_k^{+\omega}$ ($k \ge 0$); but one has to add to the right-hand sides of these
assertions the condition that both $L_\phi^{+\omega}$ and $L_{\neg\phi}^{+\omega}$ are approximable.
With analog of Lemma 2.3 at hand, it is easy to adjust also the proof of 
Theorem 2.1 to our case. In place of the set B we shall take now the set B\ = 
< w}; it is definable and (Ri; ^) has order type w. It remains to modify 
the notion of inner structures in such a way that their universes are subsets of 
$B_1$. This completes the proof.

3 Brzozowski-Type Hierarchies

Here we shall consider some versions of the well known Brzozowski hierarchy.
Some results of this section have some relevance to the independent papers [T99,
GS00, S00]. I am grateful to an anonymous referee for hints to these papers.

Following [Th82] (with some minor changes), we enrich the signature $\sigma$ of
the preceding section to the signature $\sigma' = \sigma \cup \{\bot, \top, p, s\}$, where $\bot$ and $\top$ are
constant symbols while $p$ and $s$ are unary function symbols ($\bot$, $\top$ are assumed
to denote the least and the greatest elements, while $p$ and $s$ are respectively
predecessor and successor functions). Let us also add to the axioms of CLO the
following axioms:

$\forall x\, (\bot \le x \le \top)$,

$\forall x\, (p(x) \le x \wedge \neg\exists y\, (p(x) < y < x))$, $\forall x\, (x \le s(x) \wedge \neg\exists y\, (x < y < s(x)))$,

$\forall x > \bot\, (p(x) < x)$ and $\forall x < \top\, (x < s(x))$.

We denote the resulting theory CLO'. For models $\mathbf{U}, \mathbf{V}$ of this theory, $\mathbf{U} \subseteq' \mathbf{V}$
means that $\mathbf{U}$ is a substructure of $\mathbf{V}$ respecting all symbols from $\sigma'$.

There is also a "relational" version of CLO' defined as follows. Let $\sigma'' =
\sigma \cup \{\bot, \top, S\}$, where $S$ is a binary predicate symbol ($\bot$, $\top$ are as above, while $S$
denotes the successor predicate). Let CLO'' be obtained from CLO by adjoining
the axioms

$\forall x\, (\bot \le x \le \top)$,

$\forall x, y\, (S(x, y) \leftrightarrow x < y \wedge \neg\exists z\, (x < z < y))$,

$\forall x < \top\, \exists y\, S(x, y)$ and $\forall x > \bot\, \exists y\, S(y, x)$.

Using the standard procedure of extending a theory by definable predicate
and function symbols (see e.g. [Sh67]), one easily sees that CLO' and CLO'' are
essentially the same theory (e.g., every model of one theory may be considered in a
unique way as a model of the other, the natural translations respect the classes
of sentences $S_n$ and the analogs of other classes from Section 2, any of these classes
modulo one theory is decidable if and only if it is decidable modulo the other
theory, and so on). For this reason our notation will not distinguish between
these theories.

From the axioms it easily follows that the countable CLO'-models consist of all
finite CLO-models and all countably infinite CLO-models of order type
$\omega + Z \cdot L + \omega^-$, where $\omega$, $\omega^-$, $Z$ are respectively the order types of positive, negative and all
integers, $L$ is a countable (possibly empty) linear ordering, $Z \cdot L$ is the linear
ordering obtained by inserting a copy of $Z$ in place of any element of $L$, and $+$
is the operation of "concatenation" of linear orderings.

For the theory CLO' the analogs of Lemmas 2.1 and 2.2 hold true with some
evident changes in formulation (say, the right-hand side of 2.1.(iv) now looks like
$\forall \mathbf{U} \models \phi\ \exists \mathbf{u} \subseteq \mathbf{U}\ \forall \mathbf{V}\, (\mathbf{u} \subseteq \mathbf{V} \subseteq' \mathbf{U} \to \mathbf{V} \models \phi)$, where $\subseteq$ has the same meaning as
in Section 2 and $\mathbf{u}$ is a finite CLO-model).

Repeating now the proof of Theorem 2.1, we immediately get the following
assertion, in which classes of sentences are defined just as in Section 2, but
modulo theory CLO'.

Theorem 3.1. The classes of sentences $S_1$, $S_2$, $B(S_1)$, $D_k$ ($k \ge 0$) modulo theory
CLO' are decidable.

Remark 3.1. As in Section 2, the proof of Theorem 3.1 implies the decidability 
of the monadic second-order theory of the class of countable CLO'-models. 

We see that for the case of all countable "words" the theory CLO' is treated
quite similarly to the theory CLO.

Let us now turn to finite words. Classes of sentences $\Sigma_n^+$, $B(\Sigma_n^+)$, $D_n^+$ are
defined by analogy with Section 2. Again, as in Section 2, these classes include the
corresponding classes without $+$, but the converse inclusions are far from being
true. E.g., the sentence $\exists x Q_a(x) \wedge \forall x > \bot(Q_a(x) \rightarrow \exists y(\bot \leq y < x \wedge Q_a(y)))$
belongs to $\Sigma_1^+$ but not to $\Sigma_2$.

The treatment of the $+$-classes modulo theory CLO' turns out to be more
complicated, as compared with CLO. A reason is that if $U \subseteq' V$ and one of the
CLO'-models $U$, $V$ is finite then $U = V$. Hence, the analog of Lemma 2.3 is
false.

In this situation, the following notion from [St85] is of some use. Let
$u = u_1 \ldots u_m$ and $v = v_1 \ldots v_n$ be words from $A^+$, $u_i, v_j \in A$. A k-embedding from $u$
to $v$ is an increasing function $\theta : \{1, \ldots, m\} \rightarrow \{1, \ldots, n\}$ such that

(i) $\theta(j) = j$, $j = 1, \ldots, \min(k, m)$,

(ii) $\theta(m - j) = n - j$, $j = 0, \ldots, \min(k - 1, m - 1)$,

(iii) $u_{i+j} = v_{\theta(i)+j}$, $j = 0, \ldots, k$, $i + j \leq m$.

This means that $u$ is a subword of $v$ including the first k letters of $v$ and the
last k letters and such that any letter used to build $u$ is followed by the same k
letters in $u$ and in $v$.

We write $u \leq_k v$ to denote that there is a k-embedding from $u$ to $v$. For finite
CLO-models $u$ and $v$, we write $u \subseteq_k v$ to denote that $u \subseteq v$ and the identity
function is a k-embedding from $u$ to $v$ ($u$ and $v$ are words corresponding to the
models as in Section 2). With some evident modifications we may apply the last
relation also to countably infinite CLO-models.

We concentrate on formulations and analogies with Section 2, skipping (following
a referee's suggestion) the rather technical proofs.

Lemma 3.1. (i) If u v then u <* v. 

(ii) is a partial ordering. 

(iii) <° coincides with C. 

(iv) If u Cf' V then au <* av for any a G A. 

(v) For all u and k, there is an existential a' -sentence fff such that u U 

Let $\Sigma^k$ be the set of sentences equivalent in the theory CLO' to a finite
conjunction of finite disjunctions of sentences $\phi_u^k$ ($u \in A^+$). Let $\{D_n^k\}_n$ be the
difference hierarchy over $\Sigma^k$. Then we have the following analog of Lemma 2.3 (i)-(iii).

Lemma 3.2. (i) $(A^+; \leq_k)$ is a well partial ordering.

(ii) $\phi \in \Sigma^k$ iff $L^+_\phi$ is closed upwards under $\leq_k$.

(iii) $\phi \in D_n^k$ iff there is no 1-alternating $\leq_k$-chain of words for $\phi$ of length
n.

(iv) $\phi \in B(\Sigma^k)$ iff there is no infinite alternating $\leq_k$-chain of words for $\phi$.

The analog of Lemma 2.3.(iv) is more intricate. In the following assertion the
boldface letters have the same meaning as in Section 2.

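Lemma 3.3. (i) If $u \subseteq v \subseteq_k U$ and $u \subseteq_k U$ then $u \subseteq_k v$.

(ii) $\phi \in B(\Sigma^k)$ iff $\forall U\, \exists u \subseteq U(\forall v(u \subseteq v \subseteq_k U \rightarrow v \models \phi) \vee \forall v(u \subseteq v \subseteq_k U \rightarrow v \models \neg\phi))$.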

Repeating now the argument from Section 2, we get the following generalization
of Theorem 2.2 (by Lemma 3.1.(iii), Theorem 2.2 is obtained if one takes
$k = 0$).

Theorem 3.2. For all k and n, classes $D_n^k$ and $B(\Sigma^k)$ are decidable.

Let us now show that $\Sigma^{k+1}$ contains many new sentences as compared with $\Sigma^k$.

Lemma 3.4. If the alphabet A contains at least two letters then $\Sigma^{k+1} \not\subseteq B(\Sigma^k)$
for any k.

Now we shall relate the classes $\Sigma^k$ and $\Sigma_1^+$. Let $n > 0$, $w_1, \ldots, w_n \in A^+$,
$l_i = |w_i|$ and $w_1 \ldots w_n = a_1 \ldots a_m$ ($a_j \in A$, $m = l_1 + \cdots + l_n$). Let $\phi(w_1, \ldots, w_n)$ be
a $\Sigma_1$-sentence of signature $\sigma''$ saying that there exist $x_1 < \cdots < x_m$ such that
$x_1 = \bot$, $x_m = \top$, $Q_{a_i}(x_i)$ for $i = 1, \ldots, m$ and $S(x_i, x_{i+1})$ for
$i \in \{1, \ldots, m - 1\} \setminus \{l_1, l_1 + l_2, \ldots, l_1 + \cdots + l_{n-1}\}$.

Lemma 3.5. (i) $u \models \phi(w_1, \ldots, w_n)$ iff $u = w_1 v_1 w_2 v_2 \ldots w_n$ for some
$v_1, \ldots, v_{n-1} \in A^*$.

(ii) For any $\phi \in \Sigma_1^+$ with $L^+_\phi \neq \emptyset$, there is a disjunction $\psi$ of sentences of the
form $\phi(w_1, \ldots, w_n)$ satisfying $L^+_\phi = L^+_\psi$.

Now we can state the desired relationship. 

Lemma 3.6. (i) $\Sigma_1^+ = \bigcup_k \Sigma^k$.

(ii) For any n, $D_n^+ = \bigcup_k D_n^k$.

(iii) $B(\Sigma_1^+) = \bigcup_{n,k} D_n^k$.

Theorem 3.2 together with a result from [St85] implies

Corollary 3.1. The class $B(\Sigma_1^+)$ is decidable.

Corollary 3.1 is equivalent to the well-known result that the class of so-called
languages of dot-depth one is decidable.

Remark 3.2. Unfortunately, the results of this section are not as complete and
elegant as those in Section 2. The proof of the corollary is not completely
satisfactory from the point of view of the methodology of our paper, because it uses an
automata-theoretic argument (in the proof of the cited result from [St85]).
4 The Empty Word

Here we relate the hierarchies considered above to the "real" Straubing and
Brzozowski hierarchies which classify subsets of $A^*$ (rather than $A^+$). We state
a simple relationship that aims to avoid annoying discussions (and sometimes
even confusion) caused by the role of the empty word $\varepsilon$ in this context.

The Straubing hierarchy is defined as follows (see [PP86]): let $\mathcal{B}_0 = \mathcal{A}_0 =
\{\emptyset, A^*\}$; let $\mathcal{B}_{n+1}$ be the closure of $\mathcal{A}_n$ under $\cap$, $\cup$, and the operation assigning to
languages X, Y and a letter $a \in A$ the concatenation language XaY; finally, let
$\mathcal{A}_{n+1} = B(\mathcal{B}_{n+1})$ be the Boolean closure of $\mathcal{B}_{n+1}$. The sequence $\{\mathcal{B}_n\}$ is known
as the Straubing hierarchy.

In [PP86], a natural logical description of the introduced classes of languages 
was established. Namely, classes of sentences and 14 were found such that 
Bn = {L^\4> G Sn} and An = {L4,\4> G 14}. Here is defined similarly to 
the language $L^+_\phi$ in Section 2, but now the empty structure is also admitted
(with a natural notion of satisfaction).

Let Sn = G S:^}, where 5”+ is the class from Section 2. For T C 

P(4+), let T® = {X U {£}|4T G X}. Then the desired relationship between 
introduced classes looks as follows. 

Theorem 4.1. For any n> 0, U 5® and An = B{Sn) U B{SnY ■ 

Proofsketch. First note that C Bn (if X G then X = for a sentence 
G C Sn starting with the existential quantifier; hence e ^ and X = 

G Bn-) 

The desired equalities are checked by induction on n. We have already proven 
that C B\. Note that {e} = L^, where </> is Vx(x Y hence {e} G B\. But 
Bi is closed under U, so C Bi and 5i U ^B\. 

For the converse, recall that B\ is a closure of Ao, hence for proving the 
inclusion C U it suffices to show that the class 5i U contains Ao 
and is closed under U,n and the operation XaY. Only the last assertion is 
not evident, so let us deduce XaY G 5i U from X,Y G U By the 
cited result from [PP86], X = L^f, and Y = for some </>, G Si. Let 9 be 
^x{Qa{x) A A where and are evident relativizations of 

(j) and Y, respectively. By definition of Si [PP86], 0 G Si , hence XaY G 5i. 

The equality Ai = B{Si) U B{Si)^ is easy, which completes the induction 
basis. The argument of induction step is almost the same as for the basis. This 
completes the proof. 

Let {T>n.k}k be the difference hierarchy over and {'F>'^j.}k be the difference 
hierarchy over Bn. Using Theorem 4.1 and an evident set-theoretic argument, 
we get 

Corollary 4.1. For all n and k, V'^ f. = Vn^k U 1?^ f.. 

A similar relationship exists between the Brzozowski hierarchy and the cor- 
responding classes from Section 3. 
5 Conclusion

We see that some problems of automata theory not only may be formulated in
a logical form, but can even be solved by logical means. It is natural to ask a
general logical question generalizing the problems considered in Sections 2 and 3.
For a given theory T, let $\Sigma_n$ be the set of sentences equivalent in the theory T
to a $\Sigma^0_n$-sentence. Let $\Sigma_n^+$ be defined similarly but using the equivalence in finite
structures. One can also define classes $D_{n,k}$ ($D^+_{n,k}$) of the difference hierarchy
over $\Sigma_n$ (respectively, over $\Sigma_n^+$), and even classes of the fine hierarchy over $\{\Sigma_n\}$
(see [Se91, Se95]).

The general question is to determine in what cases the introduced classes of 
sentences are decidable. Problems considered in Sections 2 and 3 are obtained 
when one considers the theories CLO and CLO' in place of T. 

The question is quite traditional for mathematical logic, hence one could hope
to find some relevant information in the logical literature. Indeed, in [Ma71] we
find (with references to the source papers) the following result: if T is undecidable
then so are the classes $\Sigma_n$ for all $n > 0$. But what about the more interesting case of
a decidable theory T (which is the case for CLO and CLO')? It seems that,
strangely enough, almost nothing is known about this natural problem.
From results in [Se91a, Se92] (which rely upon Tarski's elementary classification
of Boolean algebras) one can easily deduce the following result.

Theorem 5.1. Modulo the theory T of Boolean algebras, all classes $D_{n,k}$ (and even
all classes of the fine hierarchy) are decidable.

Proof. In [Se91a, Se92] we have described an effective sequence of sentences
$\phi_0, \phi_1, \ldots$ such that any sentence $\phi$ is equivalent (modulo the theory of Boolean
algebras) to exactly one of the $\phi_i$, and the position of any $\phi_i$ in the hierarchy $\{D_{n,k}\}$ was
completely determined. This evidently implies the desired algorithm, completing
the proof.

It seems interesting to consider analogs of Theorem 5.1 for other popular 
decidable theories, say for Abelian groups. 

We hope that methods developed in this paper may be used in some other 
similar situations, say for the case of tree languages. 
Abstract. We give an algebraic characterization of the regular languages defined
by sentences with both modular and first-order quantifiers that use only two
variables.



1 Introduction 

One finds in the theory of finite automata a meeting ground between algebra and logic,
where difficult questions about expressibility can be classified, and very often
effectively decided, by appeal to the theory of semigroups. This line of research began with
the work of McNaughton and Papert [7], who showed that the languages definable by
first-order sentences over ‘<’ are precisely the ‘star-free’ regular languages, and thus,
by a theorem of Schützenberger, the languages whose syntactic monoids are aperiodic,
that is, contain no nontrivial groups. Let us give an example of the kind of first-order
formulas we are considering:

$\exists x \exists y(Q_\sigma x \wedge Q_\sigma y \wedge x < y \wedge \neg\exists z(x < z \wedge z < y))$.

This sentence is meant to be interpreted in words over a specified finite alphabet $\Sigma$ that
contains the letter $\sigma$. The variables in the sentence denote positions in the word (that
is, integers between 1 and the length of the word, inclusive) and the subformula $Q_\sigma x$
means ‘the letter in position x is $\sigma$’. The subformula $x < y \wedge \neg\exists z(x < z \wedge z < y)$ says
that position x is to the left of position y, and that there is no position strictly between
them (i.e., that $y = x + 1$), and thus the whole sentence says ‘there are two consecutive
occurrences of $\sigma$’. We say that the sentence defines a language over $\Sigma$, namely the set
of all strings that contain the factor $\sigma\sigma$.
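The semantics just described can be checked by brute force on short words. The following is only an illustrative sketch: the function name and the encoding of the two letters as the characters 's' and 't' are assumptions made for this example, not notation from the paper.

```python
from itertools import product

def has_consecutive_sigma(word: str, sigma: str = "s") -> bool:
    """Evaluate Ex Ey (Q_sigma x & Q_sigma y & x < y & not Ez (x < z & z < y))."""
    positions = range(1, len(word) + 1)   # positions are 1-based, as in the text

    def letter(p: int) -> str:
        return word[p - 1]

    return any(
        letter(x) == sigma and letter(y) == sigma and x < y
        and not any(x < z < y for z in positions)
        for x in positions for y in positions
    )

# Compare with the direct description "contains the factor sigma sigma"
# on all words of length at most 5 over the (assumed) alphabet {s, t}.
for n in range(6):
    for w in map("".join, product("st", repeat=n)):
        assert has_consecutive_sigma(w) == ("ss" in w)
```

The quantifiers are translated literally into `any` over all positions, so the code mirrors the formula rather than an efficient recognizer.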

Since McNaughton and Papert’s work, researchers have investigated the expressibility
of regular languages in various restrictions and extensions of first-order logic over $<$.
For example, we can replace the predicate $x < y$ by the (weaker) predicate $y = x + 1$
(Beauquier and Pin [2]). We can permit the use of modular quantifiers, which count,
modulo a fixed period, the number of positions of a string satisfying a given condition
(Straubing, Thérien and Thomas [16]). For example, the sentence $\exists^{1 \bmod 2} x\, Q_\sigma x$ is
interpreted to mean ‘the number of positions containing $\sigma$ is congruent to 1 modulo 2’,
and thus defines the set of strings containing an odd number of occurrences of $\sigma$.
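A modular quantifier can be evaluated by counting positions. The sketch below is a hypothetical helper (the names `exists_mod` and `q`, and the letter 's' standing for the letter in the example, are assumptions of this illustration).

```python
def exists_mod(word, r, n, condition):
    """Modular quantifier: the number of positions x (1-based) for which
    condition(word, x) holds is congruent to r modulo n."""
    count = sum(1 for x in range(1, len(word) + 1) if condition(word, x))
    return count % n == r

def q(letter):
    """Atomic formula Q_letter x: position x carries the given letter."""
    return lambda word, x: word[x - 1] == letter

def odd_number_of_sigma(word):
    # The sentence "exists^{1 mod 2} x Q_sigma x", with sigma encoded as 's'.
    return exists_mod(word, 1, 2, q("s"))
```

For instance, `odd_number_of_sigma("tsss")` holds (three occurrences), while `odd_number_of_sigma("sts")` does not.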
In all the cases considered, the family of regular languages obtained can be
characterized in terms of the syntactic monoids or the syntactic morphisms of its members,
and in most cases this characterization gives rise to an algebraic algorithm for deciding 
membership of a given language in the family. The book by Straubing [14] provides an 
exhaustive catalogue of such results. 

Kamp [6] and later, Immerman and Kozen [5] showed that every first-order sen- 
tence over < is equivalent to such a sentence in which only three variables are used. 
The number of bound variables that occur in a formula can be considered as a kind of 
expressibility resource, along with the kinds and depth of the quantifiers and the set of 
available atomic formulas. 

Thérien and Wilke [17] considered the regular languages defined by sentences in
which only two variables are used, and found that these, too, could be characterized 
in algebraic terms: A language L is definable by a sentence with two variables if 
and only if its syntactic monoid belongs to a particular family DA of finite aperiodic 
monoids. (We will give the definition of DA in the next section.) It was already known 
that the two-variable definable languages are precisely those definable in the fragment 
of propositional temporal logic that includes both the past and future versions of the 
Next and Eventually operators, but excludes the Until operator (Etessami, Vardi and 
Wilke [4]). Since it is possible to determine from the multiplication table of a finite 
monoid whether it belongs to DA, the Thérien-Wilke result provides an algorithm for
determining whether a given regular language is definable in this fragment of temporal 
logic. 

In the present paper we investigate the effect of bounding the number of variables
in sentences that include the modular quantifiers $\exists^{r \bmod n}$ as well as ordinary first-order
quantifiers, and we characterize, again in algebraic terms, the regular languages that are
thereby defined.

We have shown that the three-variable property continues to hold for formulas that
include modular quantifiers. That is,

Theorem 1. Let $\phi$ be a sentence over $<$ containing first-order and modular quantifiers.
Then $\phi$ is equivalent to such a sentence with only three variables.

For formulas that contain only modular quantifiers, we have an even stronger result:

Theorem 2. Let $\phi$ be a sentence over $<$ in which only modular quantifiers appear.
Then $\phi$ is equivalent to such a sentence with only two variables.

Our main theorem is that the languages L defined by two-variable sentences are
characterized by membership of their syntactic monoids $M(L)$ in the pseudovariety
$\mathbf{DA} * \mathbf{G}_{sol}$, defined in Section 2:

Theorem 3. Let $\Sigma$ be a finite alphabet. A regular language $L \subseteq \Sigma^*$ is defined by
a two-variable sentence over $<$ containing first-order and modular quantifiers, if and
only if $M(L) \in \mathbf{DA} * \mathbf{G}_{sol}$.

It is important to remark that while our main theorem permits us in many individual
cases to show that a language is, or is not, two-variable definable, the general problem
of determining membership in $\mathbf{DA} * \mathbf{G}_{sol}$ is not known to be decidable.
Example. Let $\Sigma = \{\sigma, \tau\}$, and let L be the language defined by the regular expression
$(\sigma\tau)^*$. Thus $w \in L$ if and only if w contains no occurrence of the factor $\sigma\sigma$ or $\tau\tau$, and w
is either the empty string, or begins with $\sigma$ and ends with $\tau$. We saw above how to write
a first-order sentence that says a string contains no occurrence of $\sigma\sigma$ or $\tau\tau$. We can say
that a string is empty or begins with $\sigma$ with the sentence $\forall x(\forall y \neg(y < x) \rightarrow Q_\sigma x)$, and
we similarly say that a string is empty or ends with $\tau$. Thus L is definable by a first-order
sentence. We could have obtained the same conclusion by constructing the syntactic
monoid of L and verifying that it contains no nontrivial groups. A closer look at the
syntactic monoid shows that the image of the word $\sigma\tau$ under the syntactic morphism is
idempotent, but that the image of $\sigma\tau\sigma$ is not idempotent. This implies $M(L) \notin \mathbf{DA}$, so
by the theorem of Thérien and Wilke cited above, L cannot be defined by a first-order
sentence with only two variables. But L is definable by a two-variable sentence if we
permit modular quantifiers: A string belongs to L if and only if it has even length, and
has $\sigma$ in all the odd-numbered positions and $\tau$ in all the even-numbered positions. Thus
we can define L by the sentence

$\exists^{0 \bmod 2} x(x = x) \wedge \forall y(Q_\sigma y \leftrightarrow \exists^{0 \bmod 2} x(x < y))$.
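As a sanity check on the sentence above, one can evaluate it directly and compare the result with a regular-expression test. This is a sketch under stated assumptions: the two letters are encoded as 's' and 't', and the helper names are hypothetical.

```python
import re
from itertools import product

def count_mod(word, r, n, condition):
    """True iff the number of positions x (1-based) with condition(x) is r mod n."""
    return sum(1 for x in range(1, len(word) + 1) if condition(x)) % n == r

def in_L_by_sentence(word: str) -> bool:
    positions = range(1, len(word) + 1)
    # First conjunct: the number of positions is even.
    even_length = count_mod(word, 0, 2, lambda x: True)
    # Second conjunct: y carries 's' iff an even number of positions precede y.
    sigma_exactly_at_odd = all(
        (word[y - 1] == "s") == count_mod(word, 0, 2, lambda x: x < y)
        for y in positions
    )
    return even_length and sigma_exactly_at_odd

# Exhaustive comparison with the regular expression on words of length < 9.
for n in range(9):
    for w in map("".join, product("st", repeat=n)):
        assert in_L_by_sentence(w) == bool(re.fullmatch(r"(st)*", w))
```

Only two variables (x and y) are needed in the sentence, which is what the code reflects: each inner count ranges over a single position variable.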

This example shows that the situation is more complicated, and potentially more
interesting, than what one might have supposed, since the modular quantifiers can be
used to economically express properties that are not intrinsically periodic (that is, that
do not require modular quantifiers for their expression). Incidentally, the same language
is definable by another two-variable sentence whose modular quantifiers are all of
modulus 3; in fact, any modulus greater than 1 will suffice. On the other hand, the set of
strings over $\{\sigma, \tau\}$ that do not contain an occurrence of $\sigma\sigma$ has a syntactic monoid that
is not in the pseudovariety $\mathbf{DA} * \mathbf{G}_{sol}$ (this will follow from our results in Section 5
concerning the ideal structure of the monoids in this family) and thus, by Theorem 3,
cannot be defined by a two-variable sentence.
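The claims about the syntactic monoid of the language in the Example can be verified mechanically, using the standard fact that the syntactic monoid is isomorphic to the transition monoid of the minimal automaton. The sketch below assumes the letters are encoded as 's' and 't' and uses a hand-built three-state automaton; all identifiers are illustrative.

```python
# Minimal complete DFA for (st)*: state 0 is initial and accepting,
# state 1 means "just read s", state 2 is the rejecting sink.
DELTA = {"s": (1, 2, 2), "t": (2, 0, 2)}
IDENTITY = (0, 1, 2)

def compose(f, g):
    """State transformation 'first f, then g', both given as tuples."""
    return tuple(g[f[q]] for q in range(len(f)))

def image(word):
    """Image of a word in the transition monoid of the DFA."""
    result = IDENTITY
    for letter in word:
        result = compose(result, DELTA[letter])
    return result

def transition_monoid():
    """All transformations induced by words (closure under the generators)."""
    elements, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        f = frontier.pop()
        for g in DELTA.values():
            h = compose(f, g)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements

def is_idempotent(f):
    return compose(f, f) == f

def is_aperiodic(elements):
    """No nontrivial groups: x^n = x^(n+1) for every x, with n = |M|."""
    n = len(elements)

    def power(f, k):
        result = IDENTITY
        for _ in range(k):
            result = compose(result, f)
        return result

    return all(power(f, n) == power(f, n + 1) for f in elements)
```

With these definitions, `is_aperiodic(transition_monoid())` holds, `image("st")` is idempotent and `image("sts")` is not, in agreement with the Example.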

2 Background from Semigroup Theory 

A semigroup is a set together with an associative multiplication. If the semigroup
contains a multiplicative identity element, then we call the semigroup a monoid, and we
usually denote the identity by 1. If $\Sigma$ is a finite alphabet, then $\Sigma^*$, the set of all strings
over $\Sigma$, is a monoid with concatenation of strings as the multiplication. $\Sigma^*$ is the free
monoid on $\Sigma$: this means that if M is any monoid and $f : \Sigma \rightarrow M$ any map, then f
extends to a unique homomorphism from $\Sigma^*$ into M.

For a discussion of basic facts about semigroups and monoids, particularly as they 
pertain to the theory of automata (i.e., recognition of regular languages by finite
monoids, syntactic monoid and syntactic morphism, wreath product and semidirect 
product) the reader is referred to the books by Eilenberg [3] and Pin [9]. Here we briefly 
discuss the ideal structure of finite semigroups and some facts about pseudovarieties of 
finite semigroups and monoids. 

Ideal Structure and Green's Relations. If S is a semigroup and I is a nonempty subset
of S, then we say I is an ideal of S if $SI \subseteq I$ and $IS \subseteq I$. Similarly, we say that I
is a right ideal of S if $IS \subseteq I$, and a left ideal if $SI \subseteq I$. If I is an ideal, then the set
$(S - I) \cup \{0\}$ forms a semigroup with multiplication $\times$ given by $s_1 \times s_2 = s_1 s_2$ if
$s_1, s_2, s_1 s_2 \in S - I$, and $s_1 \times s_2 = 0$ otherwise. We denote this semigroup by $S/I$;
this is the image of S under the homomorphism that collapses all the elements of I to a
single element.

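If $s, t \in S$ we write $s \leq_{\mathcal{J}} t$ if s belongs to the ideal $\{t\} \cup St \cup tS \cup StS$ generated
by t. If $s \leq_{\mathcal{J}} t$ and $t \leq_{\mathcal{J}} s$, then we say that s and t are $\mathcal{J}$-equivalent and write $s \equiv_{\mathcal{J}} t$.
The equivalence classes for this relation are called $\mathcal{J}$-classes. Similarly we define $\leq_{\mathcal{L}}$,
$\leq_{\mathcal{R}}$, $\equiv_{\mathcal{L}}$, $\equiv_{\mathcal{R}}$, $\mathcal{L}$-class, and $\mathcal{R}$-class, by considering left and right ideals in place of
two-sided ideals.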
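For a finite semigroup given by its elements and multiplication, the two-sided ideal generated by an element and the resulting classes can be computed directly from the definition. The sketch below is illustrative only; the function names are hypothetical, and the example semigroup (the integers modulo 6 under multiplication) is chosen here for convenience and does not come from the paper.

```python
def principal_ideal(t, elements, mul):
    """The two-sided ideal {t} U St U tS U StS generated by t."""
    left = {mul(s, t) for s in elements}                  # St
    right = {mul(t, s) for s in elements}                 # tS
    both = {mul(x, s) for x in left for s in elements}    # StS
    return {t} | left | right | both

def j_classes(elements, mul):
    """Two elements are J-equivalent iff they generate the same two-sided ideal."""
    by_ideal = {}
    for t in elements:
        ideal = frozenset(principal_ideal(t, elements, mul))
        by_ideal.setdefault(ideal, set()).add(t)
    return {frozenset(cls) for cls in by_ideal.values()}

# Example (an assumption of this sketch): Z_6 under multiplication.
Z6 = range(6)
times_mod_6 = lambda x, y: (x * y) % 6
classes = j_classes(Z6, times_mod_6)
```

Here s lies below t in the two-sided order exactly when s belongs to `principal_ideal(t, ...)`, so mutual membership amounts to equality of the generated ideals, which is what `j_classes` groups by.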

A $\mathcal{J}$-class is said to be regular if it contains an idempotent, that is, an element e such
that $e^2 = e$. The Rees Matrix Theorem, which we now state, describes the structure of
the regular $\mathcal{J}$-classes of a finite semigroup. If A, B are finite sets, G a finite group, and
$P : B \times A \rightarrow G \cup \{0\}$ a map, then $(A, B, G, P)$ denotes the semigroup $(A \times G \times B) \cup \{0\}$
with multiplication given by

$(a, g, b)(a', g', b') = (a, g \cdot P(b, a') \cdot g', b')$,

if $P(b, a') \neq 0$, and $(a, g, b)(a', g', b') = 0$ otherwise. For each regular $\mathcal{J}$-class J
of a finite semigroup, there exist finite sets A and B, a finite group G, and a map
$P : B \times A \rightarrow G \cup \{0\}$ such that the semigroup $J \cup \{0\}$ is isomorphic to $(A, B, G, P)$.
Under this isomorphism, the $\mathcal{R}$-classes contained in J are the sets $\{a\} \times G \times B$, where
$a \in A$, the $\mathcal{L}$-classes are the sets $A \times G \times \{b\}$, where $b \in B$, and every $\mathcal{R}$-class
contains at least one idempotent, as does every $\mathcal{L}$-class. We call $(A, B, G, P)$ a Rees
matrix representation of J.

A non-regular $\mathcal{J}$-class is called a null $\mathcal{J}$-class. If J is a null $\mathcal{J}$-class of a finite
semigroup and $s, t \in J$, then $st \notin J$.
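The multiplication rule of a Rees matrix semigroup is easy to experiment with. The following sketch uses a small hypothetical instance (two-element index sets, the group of integers modulo 2 written additively, and an arbitrarily chosen map P); none of these particular choices come from the paper.

```python
from itertools import product

ZERO = "zero"  # the adjoined zero element

def rees_multiply(x, y, P, group_mul):
    """Multiply two elements of (A x G x B) U {0} using the map P."""
    if x == ZERO or y == ZERO:
        return ZERO
    (a, g, b), (a2, g2, b2) = x, y
    p = P[(b, a2)]
    if p == ZERO:
        return ZERO
    return (a, group_mul(group_mul(g, p), g2), b2)

# Hypothetical instance: A = B = {0, 1}, G = Z_2 written additively.
A, B, G = (0, 1), (0, 1), (0, 1)
add_mod_2 = lambda g, h: (g + h) % 2
P = {(0, 0): 0, (0, 1): 1, (1, 0): ZERO, (1, 1): 0}   # keys are pairs (b, a)

elements = [ZERO] + list(product(A, G, B))
mul = lambda x, y: rees_multiply(x, y, P, add_mod_2)

# Brute-force checks: associativity, and the list of nonzero idempotents.
associative = all(mul(mul(x, y), z) == mul(x, mul(y, z))
                  for x in elements for y in elements for z in elements)
idempotents = [x for x in elements if x != ZERO and mul(x, x) == x]
```

In this instance every set of the form {a} x G x B and every set of the form A x G x {b} contains an idempotent, as the theorem predicts for a representation of a regular class.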

Pseudovarieties. A pseudovariety of finite semigroups is a family of finite semigroups
that is closed under finite direct products, subsemigroups and homomorphic images. A
pseudovariety of finite monoids is defined analogously. If $\mathbf{V}_1$ and $\mathbf{V}_2$ are
pseudovarieties of finite monoids, then $\mathbf{V}_1 * \mathbf{V}_2$ is defined to be the pseudovariety generated by
all wreath products $M_1 \circ M_2$, where $M_1 \in \mathbf{V}_1$, $M_2 \in \mathbf{V}_2$. If $\mathbf{V}_1$ is a pseudovariety of
finite semigroups, and $\mathbf{V}_2$ a pseudovariety of finite monoids, then $\mathbf{V}_1 \mathbin{ⓜ} \mathbf{V}_2$ consists of
all finite monoids M for which there exist finite monoids K, N and homomorphisms
$\phi : K \rightarrow N$, $\psi : K \rightarrow M$, such that $\psi$ is onto M, $N \in \mathbf{V}_2$, and, for each
idempotent $e \in N$, the semigroup $\phi^{-1}(e)$ belongs to $\mathbf{V}_1$. $\mathbf{V}_1 \mathbin{ⓜ} \mathbf{V}_2$ is a pseudovariety of finite
monoids.

In this paper we will be concerned with the following pseudovarieties of finite
monoids: $\mathbf{A}$, the finite aperiodic monoids, that is, the finite monoids that contain no
nontrivial group; $\mathbf{G}$, the finite groups; $\mathbf{G}_{sol}$, the finite solvable groups; $\mathbf{R}$, the finite
$\mathcal{R}$-trivial monoids, that is, the finite monoids with one-element $\mathcal{R}$-classes; $\mathbf{DA}$, the finite
aperiodic monoids each of whose regular $\mathcal{J}$-classes J is a subsemigroup. (That is, in
each Rees matrix representation $(A, B, G, P)$ of J, G is trivial and P never maps to 0.)

We will also consider the pseudovariety $\mathbf{LI}$ of finite semigroups consisting of all
semigroups S such that $ese = e$ for all $e, s \in S$ with e idempotent. Such semigroups
are called generalized definite or locally trivial in the literature.
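The defining identity of locally trivial semigroups can be tested directly on a finite multiplication. The sketch below is illustrative; the function name and the two small example semigroups are assumptions made here, not examples from the paper.

```python
def is_locally_trivial(elements, mul):
    """Check that e*s*e == e for every idempotent e and every element s."""
    idempotents = [e for e in elements if mul(e, e) == e]
    return all(mul(mul(e, s), e) == e for e in idempotents for s in elements)

# A left-zero semigroup (x*y = x) satisfies the identity ...
left_zero = lambda x, y: x
# ... while the two-element semilattice {0, 1} under min does not:
# for e = 1 and s = 0 we get e*s*e = 0, which differs from e.
meet = lambda x, y: min(x, y)
```

Calling `is_locally_trivial(["a", "b", "c"], left_zero)` returns True, whereas `is_locally_trivial([0, 1], meet)` returns False.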
Our main objects of study are pseudovarieties of the form $\mathbf{DA} * \mathbf{H}$, where $\mathbf{H}$ is
a pseudovariety of finite groups. There are several alternative characterizations of this
pseudovariety:

Lemma 1. For any pseudovariety $\mathbf{H}$ of finite groups,

$\mathbf{DA} * \mathbf{H} = \mathbf{DA} \mathbin{ⓜ} \mathbf{H} = \mathbf{LI} \mathbin{ⓜ} (\mathbf{J}_1 * \mathbf{H})$.

For every finite semigroup S there is a reversed semigroup $S^{rev}$, with the same
underlying set as S, and with multiplication $\times$ given by $s \times t = ts$ for all $s, t \in S$. If
$\mathbf{V}$ is a pseudovariety of semigroups or monoids, then $\mathbf{V}^{rev}$ denotes the pseudovariety
consisting of the reversals of members of $\mathbf{V}$. If $\mathbf{H}$ is a pseudovariety of finite groups,
then $\mathbf{H} = \mathbf{H}^{rev}$, since the map $h \mapsto h^{-1}$ is an anti-isomorphism of any group. Since
obviously $\mathbf{DA} = \mathbf{DA}^{rev}$, we find that $\mathbf{DA} \mathbin{ⓜ} \mathbf{H}$ is closed under reversal as well.



3 Formal Languages and Generalized First-Order Logic 

Let $\Sigma$ be a finite alphabet. We build formulas from the unary predicate symbols
$\{Q_a : a \in \Sigma\}$, the binary predicate symbol $<$, variable symbols, the boolean connectives $\neg$
and $\wedge$, and two kinds of quantifier symbols: $\exists$ and $\exists^{r \bmod m}$, where $0 \leq r < m$. The
atomic formulas are those of the form $x < y$, where x and y are variable symbols, and
$Q_a x$, where $a \in \Sigma$ and x is a variable symbol.

In our subsequent discussion, we will also use the boolean connectives $\vee$, $\rightarrow$, $\leftrightarrow$ as
well as the universal quantifier symbol $\forall$. These are all definable in terms of the original
base of symbols.

We have indicated in the introduction the meaning attached to the symbols in our
formulas. (For a formal definition of the semantics of formulas, see Straubing [14].) If
$w \in \Sigma^*$, and $\phi$ is a sentence, then we write $w \models \phi$ to mean w satisfies $\phi$, and we
say that the language $\{w \in \Sigma^* : w \models \phi\}$ is the language defined by $\phi$. We will also
consider pointed words $(w, i)$, where $w \in \Sigma^*$ and $1 \leq i \leq |w|$. A formula $\phi$ with a
single free variable is interpreted in such a pointed word, by substituting i for the free
variable. We write $(w, i) \models \phi$ if the pointed word satisfies the formula.

We will define a number of operations on formulas, which we call relativizations.
Let $\phi$ be a formula in which the variable x does not appear. The formula $\phi[\leq x]$ is
constructed recursively by beginning with the outermost quantifiers of $\phi$ and replacing
each subformula $\exists^* y \psi$, where $\exists^*$ is either an ordinary existential quantifier or a modular
quantifier, by $\exists^* y(y \leq x \wedge \psi[\leq x])$. (If $\psi$ is an atomic formula then $\psi[\leq x]$ is identical
to $\psi$.) Let $\psi$ be a sentence, and let $w \in \Sigma^*$ with $|w| \geq i$. Let v be the prefix of w of
length i. Then $(w, i) \models \psi[\leq x]$ if and only if $v \models \psi$. We define relativizations $\phi[< x]$,
$\phi[> x]$, and $\phi[\geq x]$ analogously.

Let $\theta$ be a sentence with the property that $w \models \theta$ if and only if every prefix v of w
satisfies $\theta$. Let $\phi$ be a formula. We define the relativized formula $\phi[\leq \theta]$ by recursively
replacing each subformula $\exists^* x \psi$ by

$\exists^* x(\theta[\leq x] \wedge \psi[\leq \theta])$.
Let $w \in \Sigma^*$, and let v be the longest prefix of w that satisfies $\theta$. If $\phi$ is a sentence, then
$v \models \phi$ if and only if $w \models \phi[\leq \theta]$.

Let $\phi$, $\theta$ be as in the preceding paragraph. We define the relativized formula $\phi[> \theta]$
by recursively replacing each quantified subformula $\exists^* x \psi$ by

$\exists^* x(\neg\theta[\leq x] \wedge \psi[> \theta])$.

Let $w \in \Sigma^*$, and let $w = vv'$, where v is the longest prefix of w that satisfies $\theta$. If $\phi$ is
a sentence, then $v' \models \phi$ if and only if $w \models \phi[> \theta]$.

In the full paper, we will prove Theorem 1, that every sentence in our language is
equivalent to a sentence in which only three variable symbols are used. Our real interest
is in what can be expressed with only two variables. To this end, we now sketch
the proof of Theorem 2 and a slight generalization (Theorem 4 below), which we will
use to prove our main result, Theorem 3. Let $L \subseteq \Sigma^*$ be defined by a sentence in
which only modular quantifiers are used. Then $M(L) \in \mathbf{G}_{sol}$ (Straubing, Thérien and
Thomas [16]). Thus, by a theorem of Straubing [15], L can be constructed, starting
from the empty language, by repeatedly applying boolean operations and the operations
$K \mapsto \langle K, a, r, n \rangle$, where $a \in \Sigma$, $0 \leq r < n$, and $\langle K, a, r, n \rangle$ denotes
the set of strings w such that the number of factorizations $w = uav$ with $u \in K$
is congruent to r modulo n. We make the claim (stronger than Theorem 2) that L is
definable by a left-relativizable two-variable sentence; that is, a two-variable sentence
$\phi$ such that the formulas $\phi[< x]$ and $\phi[\leq x]$, which each have one free variable, are
themselves equivalent to two-variable formulas. The claim follows by noting, first, that
the empty language is defined by the sentence $\exists^{1 \bmod 2} x(x < x)$, which is certainly
a left-relativizable two-variable sentence, and second, that if K is definable by a
left-relativizable two-variable sentence $\phi$, then $\langle K, a, r, n \rangle$ is defined by the following
sentence $\psi$:

$\exists^{r \bmod n} x(Q_a x \wedge \phi[< x])$.

By assumption, $\phi[< x]$ is equivalent to a two-variable formula. Observe now that
$\psi[< y]$ is equivalent to

$\exists^{r \bmod n} x((x < y) \wedge Q_a x \wedge \phi[< x])$,

which has two variables, and that $\psi[\leq y]$ is equivalent to the same formula with $x < y$
replaced by $x \leq y$. Thus $\psi$ is a left-relativizable two-variable sentence.
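The counting operation used in this argument, which takes a language K to the set of strings whose number of factorizations uav with u in K is congruent to r modulo n, can be sketched as a higher-order function on membership tests. The names below are hypothetical, and languages are represented simply as Python predicates on strings.

```python
def counting_operation(in_K, a, r, n):
    """Membership test for the language obtained from K, a, r, n:
    w belongs to it iff the number of factorizations w = u a v
    with u in K is congruent to r modulo n."""
    def member(w):
        count = sum(1 for i, letter in enumerate(w)
                    if letter == a and in_K(w[:i]))   # u = w[:i], v = w[i+1:]
        return count % n == r
    return member

# With K the set of all strings, a = 's', r = 1, n = 2 this is
# "an odd number of occurrences of s".
odd_s = counting_operation(lambda u: True, "s", 1, 2)

# With K the strings of even length, only occurrences of 's' in
# odd-numbered (1-based) positions are counted.
even_length = lambda u: len(u) % 2 == 0
even_count_at_odd_positions = counting_operation(even_length, "s", 0, 2)
```

Each factorization corresponds to one position carrying the letter a whose preceding prefix lies in K, which is why the sentence defining the new language relativizes the sentence for K to the positions strictly before x.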

Precisely the same reasoning shows that every language in the smallest family of
languages closed under boolean operations and the operations $K \mapsto \langle K, a, r, n \rangle$ and
$K \mapsto Ka\Sigma^*$ is definable by a left-relativizable two-variable sentence. It follows from
results of Stiffler [13] that this is the family of languages whose syntactic monoids belong to
the pseudovariety $\mathbf{R} * \mathbf{G}_{sol}$. Thus we have:

Theorem 4. If $L \subseteq \Sigma^*$ is a regular language with $M(L) \in \mathbf{R} * \mathbf{G}_{sol}$, then L is
definable by a left-relativizable two-variable sentence.

4 Formulas and Games 

In this section we sketch the proof that every regular language definable by a
two-variable sentence has its syntactic monoid in $\mathbf{DA} * \mathbf{G}_{sol}$. This is one direction of our




Regular Languages Defined by Generalized First-Order Formulas 557 



main result, Theorem 3. We will need the following normal form result, whose proof 
will be given in the full paper: 

Lemma 2. Let $\theta(x)$ be a two-variable formula with a single free variable x. Then $\theta$
is equivalent to a two-variable formula in which an ordinary quantifier never appears
within the scope of a modular quantifier.

Let us fix integers $m > 1$ and $r > 0$, and let us treat as atomic formulas all
formulas with one free variable using exclusively modular quantifiers of modulus m and
quantifier depth no more than r. Observe that there are only finitely many inequivalent
formulas of this form. We look at two-variable first-order formulas over this base of
atoms. By the depth of such a formula we mean the depth of nesting of the ordinary
first-order quantifiers. In view of Lemma 2, it is sufficient to prove that the syntactic
monoid of any language defined by such a formula is in $\mathbf{DA} * \mathbf{G}_{sol}$.

For each $k \geq 0$ we define two equivalence relations, one on words, and the other on
pointed words, both denoted $\equiv_k$. We say $w_1 \equiv_k w_2$ if and only if $w_1$ and $w_2$ satisfy the
same two-variable sentences of depth k or less, and $(w_1, i) \equiv_k (w_2, j)$ if and only if the
two pointed words satisfy the same two-variable formulas $\phi(x)$ (with one free variable)
of depth k or less.

Here is an explicit description of $\equiv_0$: Let $\mathbf{H}_m$ be the pseudovariety of finite Abelian
groups of exponent m, and let $\mathbf{H}_m^r$ be the pseudovariety consisting of all finite groups
that have a normal series of length r or less in which every quotient group belongs
to $\mathbf{H}_m$. For every finite alphabet $\Sigma$, $\mathbf{H}_m^r$ has a finite $\Sigma$-generated free object F. Let
$\pi$ be the canonical homomorphism from $\Sigma^*$ onto F. It follows from results in [14]
that two words are $\equiv_0$-equivalent if and only if they have the same image under $\pi$.
Furthermore, two pointed words $(w_1, i)$ and $(w_2, j)$ are $\equiv_0$-equivalent if and only if
there are factorizations $w_1 = uav$ and $w_2 = u'av'$ where $a \in \Sigma$, $|u| = i - 1$, $|u'| =
j - 1$, $\pi(u) = \pi(u')$ and $\pi(v) = \pi(v')$. From this follows the important fact that not
only is $\equiv_0$ a congruence on words, but it is a congruence on pointed words in the sense
that if $(w_1, i) \equiv_0 (w_2, j)$, $u_1 \equiv_0 u_2$, and $v_1 \equiv_0 v_2$, then $u_1(w_1, i)v_1 \equiv_0 u_2(w_2, j)v_2$.
($u_1(w_1, i)v_1$ is shorthand for $(u_1 w_1 v_1, i + |u_1|)$.)

For $k > 0$, we characterize $\equiv_k$ in terms of a variant, due to Wilke [18], of the
Ehrenfeucht-Fraïssé game. The game is played on two pointed words $(w_1, i)$ and
$(w_2, j)$. If these are not $\equiv_0$-equivalent, then Player I wins at once, in zero rounds.
Otherwise, each round proceeds as follows. Think of each pointed word as an ordinary word
with a pebble on one position. Player I picks one of the words and moves its pebble one
or more positions to the left or right. Player II must now move the pebble in the other
word in the same direction (left if Player I moved left, right if Player I moved right).
The new pointed words $(w_1, i')$ and $(w_2, j')$ are required to be $\equiv_0$-equivalent; Player
II loses if she cannot meet this requirement. If Player II can correctly respond for k
successive rounds, then she wins the game. We can also play the game on words. In
the first round, Player I places his pebble on a position in one of the words, and Player
II pebbles a position in the other word. The resulting structures $(w_1, i)$, $(w_2, j)$ are
required to be $\equiv_0$-equivalent, or Player II loses. Play then proceeds as above for $k - 1$
additional rounds.

It is easy to prove that the standard result for model-theoretic games holds for this
variant:

Lemma 3. $(w_1, i) \equiv_k (w_2, j)$ if and only if Player II has a winning strategy in the
k-round game on these two pointed words. $w_1 \equiv_k w_2$ if and only if Player II has a
winning strategy in the k-round game on these two words.

It follows from this game characterization, and the fact that $\equiv_0$ is a congruence
on pointed words, that $\equiv_k$ is a congruence of finite index on $\Sigma^*$. By Lemma 2, every
language defined by a two-variable sentence is a union of $\equiv_k$-classes for some k, m and
r. So it is enough to prove that the quotient monoid $\Sigma^*/\equiv_k$ belongs to $\mathbf{DA} * \mathbf{G}_{sol}$.

We will prove this by induction on k. $\Sigma^*/\equiv_0$ is the free $\Sigma$-generated group in
$\mathbf{H}_m^r$, and thus is in $\mathbf{G}_{sol}$. The passage from 0 to 1 is a special case: It follows from a
result in Straubing [14] that the syntactic monoid of any language defined by a sentence
of depth 1 is in $\mathbf{J}_1 ** \mathbf{G}_{sol}$ (where $**$ denotes a symmetric version of the product $*$
of pseudovarieties). From a theorem of Rhodes and Tilson [11], this is the same as
$\mathbf{J}_1 * \mathbf{G}_{sol}$. Since each $\equiv_1$-class is such a language, it follows that $\Sigma^*/\equiv_1 \in \mathbf{J}_1 * \mathbf{G}_{sol}$.

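We now carry out the inductive step from $\equiv_k$ to $\equiv_{k+1}$, where $k \geq 1$. Since $\equiv_{k+1}$
refines $\equiv_k$, there is a homomorphism from $\Sigma^*/\equiv_{k+1}$ to $\Sigma^*/\equiv_k$. We claim that the
preimage of each idempotent under this homomorphism is in $\mathbf{LI}$. Since
$\mathbf{DA} * \mathbf{G}_{\mathrm{sol}} = \mathbf{LI}^{-1}(\mathbf{J}_1 * \mathbf{G}_{\mathrm{sol}})$ (by Lemma 1) and
$\mathbf{LI}^{-1}(\mathbf{LI}^{-1}\mathbf{V}) = \mathbf{LI}^{-1}\mathbf{V}$ for every pseudovariety
$\mathbf{V}$ of finite monoids, this will complete the proof.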

Suppose $u$ and $v$ are $\equiv_k$-equivalent words in $\Sigma^*$, and are idempotent in $\Sigma^*/\equiv_k$.
Suppose further that $u$ is idempotent in $\Sigma^*/\equiv_{k+1}$. We need to show $uvu \equiv_{k+1} u$,
i.e., the inverse image of each idempotent satisfies the identity $ese = e$ whenever $e$
is idempotent. Since $u$ is idempotent in $\Sigma^*/\equiv_{k+1}$, this is equivalent to
$uuvuu \equiv_{k+1} uuuuu$. By Lemma 3 it suffices to show that Player II has a winning strategy on this pair
of words in the $(k+1)$-round game. The strategy is this: If Player I moves anywhere
but the middle segment of one of the words, Player II will respond on the corresponding
position in the other word. If Player I ever moves into the middle segment, Player II will
respond according to her strategy for the $k$-round game in $u$ and $v$. If Player I moves out
of the middle segment and back in again, Player II picks up the middle segment strategy
again, starting from the beginning. This strategy will win the game for Player II unless
Player I makes all his moves in the middle segments. In that case, after $k$ rounds, the
two pointed words are $uu(v, i)uu$ and $uu(u, j)uu$. Suppose Player I now moves to the
right in the first word, remaining in $v$, giving $uu(v, i')uu$ with $i < i'$. Player II might
not be able to respond in the middle segment of the other word. Instead, she picks a
position $j'$ in $u$ such that $(v, i') \equiv_0 (u, j')$ (such a position exists because $u \equiv_k v$
and $k \geq 1$) and moves the pebble to the right to produce $uuu(u, j')u$. Since $\equiv_0$ is a
congruence on pointed words, and since $u$, being idempotent for $\equiv_k$, is idempotent for
$\equiv_0$, the resulting pointed words are $\equiv_0$-equivalent. The same strategy works if Player
I moves to the left in $v$, or moves in either direction in the middle segment of $uuuuu$.
Thus whatever Player I does, Player II can play safely for $k + 1$ successive rounds, and
so wins the game.

5 Ideal Structure of Monoids in DA * G 

In this section we state without proof some algebraic properties of pseudovarieties of
the form $\mathbf{DA} * \mathbf{H}$, where $\mathbf{H}$ is a pseudovariety of finite groups. Let $\mathcal{F}$ be a set of partial
one-to-one functions from a finite set $X$ into itself. We will denote the image of $x \in X$
under $f \in \mathcal{F}$ by $xf$. We say that $\mathcal{F}$ is $\mathbf{H}$-extendible if there is a finite set $Y$ with $X \subseteq Y$,
and a permutation group $G$ on $Y$ such that $G \in \mathbf{H}$, and for each $f \in \mathcal{F}$ there exists
$g \in G$ such that $f$ is equal to the restriction of $g$ to the domain of $f$.

Lemma 4. Let $M \in \mathbf{DA} * \mathbf{H}$, where $\mathbf{H}$ is a pseudovariety of finite groups. Let
$\mu : \Sigma^* \to M$ be a homomorphism. Let $J$ be a regular $\mathcal{J}$-class of $M$, and let $(A, B, G, P)$
be a Rees matrix representation of $J$.

(a) There exist a partition of $A$, a partition of $B$, and a bijection between the sets of
blocks of the two partitions such that $P(b, a) \neq 0$ if and only if the blocks containing $a$
and $b$ correspond under the bijection.

(b) Let $\mathcal{B}$ denote the set of blocks of the partition of $B$. Let $s \in M$. There is a one-to-one
partial function $\pi_s : \mathcal{B} \to \mathcal{B}$ such that if $b \in B$, and $B(b)$ is the block containing
$b$, then $B(b)\pi_s$ is defined if and only if $(a, g, b)s \in J$, in which case $B(b)\pi_s$ is the
block containing the right co-ordinate of $(a, g, b)s$. Moreover, the set of partial functions
$\{\pi_s : s \in M\}$ is $\mathbf{H}$-extendible.

(c) Let $B_1$, $B_2$ be two blocks of the partition of $B$. Then the language

$\{w \in \Sigma^* : B_1\pi_{\mu(w)} = B_2\}$

is recognized by a monoid in $\mathbf{J}_1 * \mathbf{H}$.

(d) Suppose $\mathbf{H} * \mathbf{H} = \mathbf{H}$. Let $(a, g, b) \in J$, $g' \in G$. Then the language

$\{w \in \Sigma^* : (a, g, b)\mu(w) \in \{a\} \times \{g'\} \times B\}$

is recognized by a monoid in $\mathbf{R} * \mathbf{H}$.



6 Two-Variable Definability for DA * Gsol

Let $\Sigma$ be a finite alphabet. We will prove in this section that if $L \subseteq \Sigma^*$ is recognized
by a monoid $M \in \mathbf{DA} * \mathbf{G}_{\mathrm{sol}}$, then $L$ is definable by a sentence with two variables.
This will complete the proof of Theorem 3.

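Let $\phi : \Sigma^* \to M$ be a homomorphism. Each $w \in \Sigma^*$ has a unique factorization
$w = w_0\sigma_1 w_1 \cdots \sigma_k w_k$, where each $\sigma_i$ is in $\Sigma$, $\phi(w_0) \equiv_{\mathcal{R}} 1$, and where, for
$i = 1, \dots, k$,

$$\phi(w_0\sigma_1 \cdots \sigma_i w_i) \equiv_{\mathcal{R}} \phi(w_0\sigma_1 \cdots w_{i-1}\sigma_i) <_{\mathcal{R}} \phi(w_0\sigma_1 \cdots w_{i-1}).$$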
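
The factorization above cuts a word at exactly those letters where the image of the prefix falls into a strictly lower $\mathcal{R}$-class. A minimal sketch, assuming the monoid is given explicitly as a tuple of elements with a multiplication function (the helper names and the two-element toy monoid are illustrative, not from the paper):

```python
def right_ideal(s, elements, mul):
    """Return the right ideal sM; s and t are R-equivalent iff sM == tM."""
    return frozenset(mul(s, x) for x in elements)


def r_factorization(word, phi, elements, mul, identity):
    """Split word into [w0, s1, w1, ..., sk, wk] along R-class drops."""
    factors, block, current = [], "", identity
    for letter in word:
        nxt = mul(current, phi[letter])
        # Right multiplication can only shrink the right ideal, so a
        # change of ideal means a strict drop in the R-order.
        if right_ideal(nxt, elements, mul) != right_ideal(current, elements, mul):
            factors.extend([block, letter])
            block = ""
        else:
            block += letter
        current = nxt
    factors.append(block)
    return factors


# Toy example: the monoid ({0, 1}, *) with phi(a) = 0 and phi(b) = 1.
ELEMENTS = (0, 1)
PHI = {"a": 0, "b": 1}


def times(s, t):
    return s * t
```

In this toy monoid the only drop happens at the first `a`, so `r_factorization("bbabb", PHI, ELEMENTS, times, 1)` gives the three factors `"bb"`, `"a"`, `"bb"`.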

Let $s, t \in M$, with $s \equiv_{\mathcal{R}} t$. We define $L[s,t] = \{w \in \Sigma^* : s \cdot \phi(w) = t\}$. Thus, if
$m \in M$, $\phi^{-1}(m)$ is the union of all languages of the form

$$L[1, t_0]\,\sigma_1 L[t_0 \cdot \phi(\sigma_1), t_1] \cdots \sigma_k L[t_{k-1} \cdot \phi(\sigma_k), t_k], \qquad (1)$$

where $t_k = m$, and, for $i = 1, \dots, k$, $t_i \equiv_{\mathcal{R}} t_{i-1} \cdot \phi(\sigma_i) <_{\mathcal{R}} t_{i-1}$. This union is finite,
since $k$ is bounded above by the number of $\mathcal{R}$-classes of $M$. It is therefore sufficient
to show that every language of the form (1) is definable by a two-variable sentence.
We prove this by induction on $|M|$: If $|M| = 1$, then the language (1) is $\Sigma^*$, which
is defined by the one-variable sentence $\neg\exists x(x < x)$. We thus suppose $|M| > 1$. Our
inductive hypothesis is that for all $M' \in \mathbf{DA} * \mathbf{G}_{\mathrm{sol}}$ with $|M'| < |M|$, languages of
the form (1) are two-variable definable.

We prove the assertion for $M$ by a second induction, this time on $k$. We begin by
considering languages of the form $L[s,t]$. First, suppose that the $\mathcal{R}$-class containing
$s$ and $t$ is contained in a regular $\mathcal{J}$-class $J$ of $M$. We identify $J$ with a Rees matrix
representation $(A, B, G, P)$. There is a partition of $B$ as specified in Lemma 4;
as before, we denote by $B(b)$ the block of this partition containing the element $b$. Let
$s = (a, g, b)$, $t = (a, g', b')$. In order for a word $w$ to belong to $L[s,t]$, we need either:

(a) $\phi(w) \notin J$ and $s\phi(w) = t$, or

(b) $\phi(w) \in J$, $B(b)$ is in the domain of $\pi_{\phi(w)}$, the middle co-ordinate of $(a, g, b)\phi(w)$ is
$g'$, and $w = w_1\sigma w_2$, where $\sigma \in \Sigma$, $\phi(w_2) \notin J$, and $\phi(\sigma w_2) \in J$ with right co-ordinate
$b'$.

The set of strings satisfying condition (a) is recognized by the monoid $M/I$, where
$I$ is the ideal consisting of all elements of $M$ that are not strictly above $J$ in the
$\mathcal{J}$-ordering. If $|M/I| = |M|$ then $J$ consists of a single element, which is the zero of $M$,
and $L[s,t] = \Sigma^*$. Thus we may suppose $|M/I| < |M|$. Since $M/I \in \mathbf{DA} * \mathbf{G}_{\mathrm{sol}}$, the
inductive hypothesis implies that this set of strings is two-variable definable.

By Lemma 4, the set of strings $w$ such that $B(b)$ is in the domain of $\pi_{\phi(w)}$ is
recognized by a monoid in $\mathbf{J}_1 * \mathbf{G}_{\mathrm{sol}}$, and, since $\mathbf{G}_{\mathrm{sol}} * \mathbf{G}_{\mathrm{sol}} = \mathbf{G}_{\mathrm{sol}}$, the set of strings
$w$ such that the middle co-ordinate of $(a, g, b)\phi(w)$ is $g'$ is recognized by a monoid
in $\mathbf{R} * \mathbf{G}_{\mathrm{sol}}$. By Theorem 4, these are both definable by left-relativizable two-variable
sentences. Let $\sigma \in \Sigma$, and let $K_\sigma$ be the set of strings $w_1\sigma w_2$, where $\phi(w_2) \notin J$, and
$\phi(\sigma w_2) \in J$ with right co-ordinate $b'$. $K_\sigma$ is then a union of sets of the form (1), but
with the sets $L[s,t]$ replaced by their $\mathcal{L}$-class duals, with respect to the monoid $M/I$.
Since, as we noted in the remarks following Lemma 1, $\mathbf{DA} * \mathbf{G}_{\mathrm{sol}}$ is closed under
reversal, the inductive hypothesis implies that each $K_\sigma$ is two-variable definable.

In the case where $J$ is a null $\mathcal{J}$-class, the product of two elements of $J$ is not in $J$.
Thus $L[s,t]$ is recognized by $M/I$, where $I$ is as defined above, and is thus two-variable
definable.

We now suppose that we have a two-variable sentence $\delta$ for the language

$$L = L[t_i \cdot \phi(\sigma_{i+1}), t_{i+1}]\,\sigma_{i+2} \cdots \sigma_k L[t_{k-1} \cdot \phi(\sigma_k), t_k],$$

and use it to obtain a two-variable definition for $L' = L[t_{i-1}\phi(\sigma_i), t_i]\,\sigma_{i+1}L$. First we
consider the case where the $\mathcal{J}$-class $J$ that contains $t_{i-1}\phi(\sigma_i)$ and $t_i$ is regular. Let
$t_{i-1}\phi(\sigma_i) = (a_1, g_1, b_1)$, $t_i = (a_2, g_2, b_2)$. Let $\theta$ be a left-relativizable two-variable
sentence for the set of strings $u$ such that $B(b_1)$ is in the domain of $\pi_{\phi(u)}$, and $\eta$ a
left-relativizable two-variable sentence for the set of strings $u$ such that the middle
co-ordinate of $(a_1, g_1, b_1)\phi(u)$ is $g_2$. Such sentences exist by Lemma 4 and Theorem 4.

Let $\zeta$ be a two-variable sentence for the set of strings $u$ such that $u = u_1\sigma u_2$, with
$\sigma \in \Sigma$, $\phi(u_2) \notin J$, and $\phi(\sigma u_2) \in J$ with right co-ordinate $b_2$. We showed above that
such a sentence exists. Let $\zeta'$ be a two-variable sentence for the set of strings $u$ such
that $\phi(u) \notin J$, and $(a_1, g_1, b_1)\phi(u) = (a_2, g_2, b_2)$. Again, we showed above that such
a sentence exists.
Our sentence defining $L'$ is

$$\exists x\,\bigl(Q_{\sigma_{i+1}}x \wedge \theta[<x] \wedge \neg\theta[\leq x] \wedge ((\eta \wedge \zeta) \vee \zeta')[<x] \wedge \delta[>x]\bigr).$$

Observe that because of the left-relativizability of $\theta$, all the relativizations in the above
sentence are two-variable formulas.

For the case of a null $\mathcal{J}$-class, we proceed in the identical fashion, except now we
do not need formulas analogous to $\eta$ and $\zeta$.



7 Directions for Further Research 

Gsol-Extendibility. The biggest question left unanswered by our work is whether one
can effectively determine if a given regular language is definable by a two-variable
sentence. It follows from our arguments that $L$ is two-variable definable if and only if
for every regular $\mathcal{J}$-class $J$ of $M(L)$, $J$ admits a block partition of the kind described
in Section 5, and the set $\{\pi_s : s \in M\}$ of partial one-to-one transformations on the set
of $B$-blocks of $J$ is $\mathbf{G}_{\mathrm{sol}}$-extendible. In fact, we are able to prove:

Theorem 5. The following two decision problems are equivalent: (a) To determine
whether a given regular language is two-variable definable. (b) To determine whether
a given set of partial one-to-one functions on a finite set is $\mathbf{G}_{\mathrm{sol}}$-extendible.

It is not known whether this latter problem is decidable. Margolis, Sapir and Weil [8]
show that if $\mathbf{H}$ is a pseudovariety of groups such that $\mathbf{H} * \mathbf{H} = \mathbf{H}$ ($\mathbf{G}_{\mathrm{sol}}$ has this
property) then the question of $\mathbf{H}$-extendibility is equivalent to the problem of computing
the closure of a finitely-generated subgroup of the free group in the profinite topology
induced by $\mathbf{H}$. Ribes and Zalesskii [12] showed that this problem is decidable for the
pseudovariety $\mathbf{G}_p$ of $p$-groups for a fixed prime $p$. As a consequence we have

Theorem 6. Let $p$ be prime. It is decidable whether a given regular language is
definable by a two-variable sentence in which all the modular quantifiers are of modulus $p$.

Modular Temporal Logic. First-order logic over < is equivalent in expressive power to
linear propositional temporal logic (LPTL), and two-variable first-order logic over < is
equivalent to the fragment of LPTL that includes both the past and the future versions of
the Next and Eventually operators, but not the Until operator. Baziramwabo, McKenzie
and Thérien [1] study an extension of LPTL that includes a modular temporal operator,
and show that this has the same expressive power as sentences over < with both modular
and ordinary quantifiers. In the full paper we will show that the fragment of modular
temporal logic that includes the past and future versions of the Next and Eventually
operators, as well as all the modular operators, captures exactly the languages in
$\mathbf{DA} * \mathbf{G}_{\mathrm{sol}}$.

Related Model-Theoretic Questions. Pin and Weil [10] show that the languages whose
syntactic monoids are in $\mathbf{DA}$ are exactly those languages that are simultaneously
definable by both a $\Sigma_2$- and a $\Pi_2$-sentence over <, with no restriction on the number
of variables. In the full paper we will show that the analogous property holds for
languages whose syntactic monoids belong to $\mathbf{DA} * \mathbf{G}_{\mathrm{sol}}$; here “$\Sigma_2$-sentence” means a
$\Sigma_2$-sentence over the base of atoms consisting of all the purely modular formulas.

We would also like to know the effect of bounding the number of variables in
sentences that use only the successor relation $y = x + 1$ in place of the more powerful
ordering relation.

Abstract. Ordered binary decision diagrams (OBDDs) are nowadays among
the most common representation types for Boolean functions.
Although they allow important operations such as satisfiability test and
equality test to be performed efficiently, their limitation lies in the fact
that they may require exponential size for important functions. Bryant
[8] has shown that any OBDD-representation of the function $\mathrm{MUL}_{n-1,n}$,
which computes the middle bit of the product of two $n$-bit numbers,
requires at least $2^{n/8}$ nodes. In this paper a stronger bound of $2^{n/2}/61$
is proven by a new technique, using a recently found universal family of
hash functions [23]. As a result, one cannot hope anymore to find reasonably
small OBDDs even for the multiplication of relatively short integers,
since for only a 64-bit multiplication millions of nodes are required.
Further, a first non-trivial upper bound of $7/3 \cdot 2^{4n/3}$ for the OBDD size of
$\mathrm{MUL}_{n-1,n}$ is provided.

1 Introduction and Results 

Binary Decision Diagrams (BDDs), introduced by Lee [15] and Akers [1], are 
a well established representation type of Boolean functions. While the general 
model has a large representational power and allows the simulation of any other 
general model of computation, its generality also has certain severe drawbacks. 
They lie in the NP-hardness of several important operations which should be 
available for a representation serving as a dynamic data structure (see also [21]). 
Among these operations are e.g. the equivalence test (which tests whether two 
representations describe the same function), the satisfiability test (which tests 
whether there exists a satisfying input for the represented function), or the 
minimization problem (which is to minimize the size of the representation of 
a given function). 

In order to overcome these drawbacks, Bryant [7] has introduced restricted 
BDDs, called ordered binary decision diagrams (OBDDs). 

Definition 1. Let $X_n = \{x_1, \dots, x_n\}$ be a set of Boolean variables.

1. A variable ordering $\pi$ on $X_n$ is a permutation of the indices $\{1, \dots, n\}$,
leading to the ordered list $x_{\pi(1)}, \dots, x_{\pi(n)}$ of the variables.

2. A $\pi$-OBDD on $X_n$ for a variable ordering $\pi$ is a directed acyclic graph with
the following properties: Each sink is labeled by a constant 0 or 1. Each
inner node is labeled by a variable from $X_n$ and has two outgoing edges, one
of them labeled by 0, the other by 1. If an edge leads from a node labeled by
$x_i$ to a node labeled by $x_j$, then $\pi^{-1}(i) < \pi^{-1}(j)$. This means that any path
on the graph passes the nodes in an order respecting the variable ordering $\pi$.

3. A node $v$ of a $\pi$-OBDD is said to compute a Boolean function $f_v : \{0,1\}^n \to
\{0,1\}$, if for any $a = (a_1, \dots, a_n) \in \{0,1\}^n$, the path starting from $v$ and
leading from any $x_i$ node over the edge labeled by the value of $a_i$, finishes
at the sink with label $f_v(a)$. A $\pi$-OBDD with a root $v$ is said to compute a
Boolean function $f$, if $v$ computes $f$.

4. The size of a $\pi$-OBDD is the number of its nodes. The $\pi$-OBDD size of a
Boolean function $f$ (short: $\pi$-OBDD($f$)) is the size of the minimum
$\pi$-OBDD computing $f$. Finally, the OBDD size of $f$ (short: OBDD($f$)) is the
minimum of $\pi$-OBDD($f$) for all variable orderings $\pi$.
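
For very small functions the $\pi$-OBDD size of Definition 1 can be computed by brute force. The sketch below relies on a standard characterization that is not stated in this text and is assumed here: the minimal $\pi$-OBDD has one node labeled by the $i$-th variable of the ordering for every distinct subfunction, obtained by fixing the preceding variables, that depends essentially on that variable. The function name `obdd_size` and the 0-based variable indices are illustrative choices.

```python
from itertools import product


def obdd_size(f, n, order):
    """Size (inner nodes plus sinks) of the minimal OBDD of f for `order`.

    f takes a tuple of n bits; order lists the variable indices 0..n-1.
    """
    inner = 0
    for level in range(n):
        fixed_vars, free_vars = order[:level], order[level:]
        seen = set()
        for fixed in product((0, 1), repeat=level):
            table = []
            # Truth table of the subfunction; the level's own variable is
            # the most significant position of the free assignment.
            for free in product((0, 1), repeat=n - level):
                x = [0] * n
                for var, bit in zip(fixed_vars, fixed):
                    x[var] = bit
                for var, bit in zip(free_vars, free):
                    x[var] = bit
                table.append(f(tuple(x)))
            half = len(table) // 2
            if table[:half] != table[half:]:   # depends on the level variable
                seen.add(tuple(table))
        inner += len(seen)
    sinks = len({f(x) for x in product((0, 1), repeat=n)})
    return inner + sinks


def two_ands(x):
    # Example function x0*x1 OR x2*x3, sensitive to the variable ordering.
    return (x[0] & x[1]) | (x[2] & x[3])
```

For `two_ands`, the ordering `[0, 1, 2, 3]` gives size 6, while interleaving the pairs as `[0, 2, 1, 3]` gives size 8; the gap grows exponentially when more pairs are added.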

Efficient algorithms on $\pi$-OBDDs are known for all important operations,
such as the ones mentioned above (for an in-depth discussion of OBDDs
and their operations see [21]). The size of a $\pi$-OBDD, though, can be quite
sensitive to the chosen variable ordering $\pi$, and finding a variable ordering leading to
small or even minimal $\pi$-OBDDs is a hard problem (see [4, 7, 18]). Furthermore,
since there exist $2^{2^n}$ Boolean functions of $n$ variables, it can be seen that almost
all functions require an exponential number of elements in any realization by
networks using only primitive elements. However, this is not disturbing as long
as small representations of a certain kind exist for all practically relevant
families of functions. Therefore, one is interested in finding exponential lower bounds
for the size of OBDDs (and other representation types) computing important
families of functions, such as integer multiplication.

Definition 2. The Boolean function $\mathrm{MUL}_{k,n} : \{0,1\}^{2n} \to \{0,1\}$ computes the
bit $z_k$ of the product $(z_{2n-1} \dots z_0)$ of two integers $(y_{n-1} \dots y_0)$ and $(x_{n-1} \dots x_0)$.
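
Definition 2 translates directly into integer arithmetic. A minimal sketch, where the inputs are passed as integers rather than bit vectors and the function names are illustrative:

```python
def mul_bit(k, n, x, y):
    """Bit z_k of the product of the n-bit integers x and y (MUL_{k,n})."""
    if not (0 <= x < 2 ** n and 0 <= y < 2 ** n):
        raise ValueError("x and y must be n-bit numbers")
    return ((x * y) >> k) & 1


def middle_bit(n, x, y):
    """The middle bit MUL_{n-1,n} studied in this paper."""
    return mul_bit(n - 1, n, x, y)
```

For example, with $n = 4$ the product $5 \cdot 3 = 15 = (00001111)_2$ has middle bit $z_3 = 1$, whereas $2 \cdot 3 = 6 = (00000110)_2$ has $z_3 = 0$.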

The first step towards bounding the size of OBDDs for integer multiplication
was done by Bryant in 1986 [7]. He showed that for any variable ordering $\pi$, there
exists an index $k$, such that the $\pi$-OBDD size for $\mathrm{MUL}_{k,n}$ is at least $2^{n/8}$. This
result, though, would still allow the possibility that one might obtain polynomial
size OBDDs for all functions $\mathrm{MUL}_{k,n}$ by choosing different variable orderings for
different output bits. In 1991, Bryant found that computing the middle bit of
multiplication (that is, $\mathrm{MUL}_{n-1,n}$) requires a $\pi$-OBDD containing $2^{n/8}$ nodes for
any variable ordering $\pi$ [8]. More precisely, he showed that for any $1 \leq k \leq n$,
OBDD($\mathrm{MUL}_{k-1,n}$) and OBDD($\mathrm{MUL}_{2n-k-1,n}$) have a value of at least $2^{k/8}$.

Although this proves the exponential size of OBDDs for multiplication, the
bound is - as stated e.g. by Bollig and Wegener in [5] - not satisfactory. This is
because Bryant’s bound would still allow the possibility that one can construct
64-bit multipliers represented by OBDDs containing only 256 nodes, while on
the other hand it is widely conjectured that OBDDs computing $\mathrm{MUL}_{n-1,n}$ have
a size of at least $2^n$. This would mean that such a multiplier could not even be
realized with millions of nodes. Since one would like to use OBDDs for realistic
applications, such as verification of multipliers, one is interested in either finding
such small constructions or a better lower bound. The following result, which is
proven in the next two sections, provides the second alternative:

Theorem 1. Any OBDD computing $\mathrm{MUL}_{n-1,n}$ has a size of at least $2^{\lfloor n/2 \rfloor}/61$.

This bound shows that any OBDD for 64-bit multiplication must be constructed 
of more than 70 million nodes and thus demonstrates a true weakness of the 
representation type. The technique leading to this result is new. It relies heavily
on a recently found universal family of hash functions [23] and makes use of a
new lemma showing how such functions distribute two arbitrary sets over the 
range. Universal hashing is introduced in the next section. 

Since it is generally believed, though, that the true bound on the OBDD size
for $\mathrm{MUL}_{n-1,n}$ is still larger, it is of interest to have an upper bound, too. Note
that for any Boolean function on $m$ variables, there exists an OBDD of size
$(2 + o(1))\,2^m/m$ [6], so a trivial upper bound for OBDD($\mathrm{MUL}_{n-1,n}$) is roughly
$2^{2n}/n$. The following upper bound, proved in Section 4, is the first non-trivial
one.

Theorem 2. There exists an OBDD for $\mathrm{MUL}_{n-1,n}$ having a size of $7/3 \cdot 2^{4n/3}$.

The bound shows that the middle bit of a 16-bit multiplication can be com- 
puted by an OBDD containing less than 6.2 million nodes. Although the proof 
is existential, constructions of OBDDs satisfying this bound can be derived from 
it. 
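
The concrete figures quoted in this section follow from the three bounds by plain arithmetic. A small sketch that evaluates them (the function names are illustrative):

```python
def bryant_lower_bound(n):
    """Bryant's lower bound 2^(n/8) on the OBDD size of MUL_{n-1,n}."""
    return 2 ** (n / 8)


def new_lower_bound(n):
    """Lower bound 2^floor(n/2) / 61 from Theorem 1."""
    return 2 ** (n // 2) / 61


def upper_bound(n):
    """Upper bound 7/3 * 2^(4n/3) from Theorem 2."""
    return 7 / 3 * 2 ** (4 * n / 3)


if __name__ == "__main__":
    print(bryant_lower_bound(64))   # 256 nodes for 64-bit multiplication
    print(new_lower_bound(64))      # a little over 70 million nodes
    print(upper_bound(16))          # a little under 6.2 million nodes
```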

2 Universal Hashing 

The concept of universal hashing was introduced by Carter and Wegman in 
1979 [9]. While one of its original purposes was to use randomization in hashing 
schemes instead of relying on the distribution of the inputs, it has found over the 
years a large variety of applications in many different areas. They range
from algorithms for various types of hashing-based dictionaries [9, 12, 13]
through cryptographic applications such as message authentication [2, 17] to
complexity-theoretic statements [14, 20].

Universal hash families are usually defined by the following notation: Let $\mathcal{H}$
be a family of hash functions $U \to R$. $U$ and $R$ are called universe and range,
respectively. For arbitrary $x, x' \in U$ and $h \in \mathcal{H}$, we define

$$\delta_h(x, x') = \begin{cases} 1 & \text{if } x \neq x' \text{ and } h(x) = h(x'), \\ 0 & \text{otherwise.} \end{cases}$$

If one or more of $h$, $x$ and $x'$ are replaced in $\delta_h(x, x')$ by sets, then the sum is
taken over the elements from these sets. E.g., for $H \subseteq \mathcal{H}$, $V \subseteq U$ and $x \in U$,
$\delta_H(x, V)$ means

$$\sum_{h \in H} \sum_{x' \in V} \delta_h(x, x').$$

Definition 3. A family $H$ of hash functions $U \to R$ is universal, if for any two
distinct $x, x' \in U$

$$\delta_H(x, x') \leq \frac{|H|}{|R|}.$$
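
Definition 3 can be checked exhaustively for small families. The sketch below does so for the classical family $x \mapsto ((ax + b) \bmod p) \bmod m$ with a prime $p$, which is used here only as a familiar example and is not the family studied in this paper; the function names are illustrative.

```python
from itertools import combinations


def is_universal(family, universe, range_size):
    """Check that every pair x != x' collides under at most |H|/|R| functions."""
    for x, y in combinations(universe, 2):
        collisions = sum(1 for h in family if h(x) == h(y))
        # Integer form of: collisions <= len(family) / range_size
        if collisions * range_size > len(family):
            return False
    return True


def mod_prime_family(p, m):
    """All functions x -> ((a*x + b) mod p) mod m with 1 <= a < p, 0 <= b < p."""
    return [
        (lambda x, a=a, b=b: ((a * x + b) % p) % m)
        for a in range(1, p)
        for b in range(p)
    ]
```

With $p = 7$ and $m = 3$ the family has 42 members and passes the check on the universe $\{0, \dots, 6\}$, while the one-element family containing only $x \mapsto x \bmod 3$ fails, since 0 and 3 always collide.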

A stronger definition of so-called "strongly universal hash families" was given
in [22]. Among the many applications, there were also interesting results
concerning branching programs (or equivalently BDDs). So, Mansour, Nisan and Tiwari
[16] investigated the computational complexity of strongly universal hashing, 
and gave a lower bound for the time-space tradeoff of branching programs com- 
puting the functions of such families. Further, Beame, Tompa and Yan [3] have 
found results on the communication-space tradeoff of strongly universal hash 
families in a general model of two communicating branching programs. 

For OBDDs, it is not possible to show a general exponential lower bound for
universal hash families. E.g., the convolution of two bit strings can be viewed
as a strongly universal hash family [16], but it can be easily seen that for any
output bit of the convolution, there exists a variable ordering $\pi$ leading to a
linear size $\pi$-OBDD.

The property of universal hash families we will use here, can be described in 
the following way: If there are two large enough subsets of the universe given, 
then there exists a hash function under which the function values of the elements 
from each set cover a large fraction of the range. This is in a way a twisted version 
of the known results, telling that there exists a function under which the number 
of collisions of elements from a set is small. 

For a function $h : U \to R$ and a subset $M \subseteq U$, define $h(M)$ to be the image
of $M$ under $h$, namely

$$h(M) := \{y \in R \mid \exists x \in M : h(x) = y\}.$$

Lemma 1. Let $H$ be a universal family of hash functions $U \to R$ and $0 < \epsilon < 1$.
Then for arbitrary $M, N \subseteq U$ with

$$|M| = |N| > 2\,(|R| - 1)\,\frac{\epsilon}{1 - \epsilon},$$

there exists a hash function $h \in H$ such that $h(M)$ and $h(N)$ contain more than
$\epsilon|R|$ elements each.

Proof. Let $r = |R|$, $m = |M| = |N|$ and for $h \in H$ let the random variable $X_h$
be the sum of $\delta_h(M, M)$ and $\delta_h(N, N)$. Using the universal property of $H$, we
obtain for a randomly chosen function $h$ an upper bound for the expectation of
$X_h$:

$$\mathrm{E}_{h \in H}[X_h] = \sum_{\substack{x, x' \in M \\ x \neq x'}} \mathrm{E}_{h \in H}[\delta_h(x, x')] + \sum_{\substack{y, y' \in N \\ y \neq y'}} \mathrm{E}_{h \in H}[\delta_h(y, y')] \leq \frac{2}{r}\, m(m - 1).$$

This means by the probabilistic method that there exists an $h_0 \in H$ with

$$X_{h_0} \leq \frac{2}{r}\, m(m - 1). \qquad (1)$$

In order to prove that this $h_0$ fulfills the claim, we assume that $h_0(M)$
contains at most $\epsilon r$ elements. By summing over the ordered pairs of elements in
$h_0^{-1}(y) \cap M$ for each $y \in h_0(M)$, we get

$$\delta_{h_0}(M, M) = \sum_{y \in h_0(M)} |h_0^{-1}(y) \cap M| \left( |h_0^{-1}(y) \cap M| - 1 \right) = \sum_{y \in h_0(M)} |h_0^{-1}(y) \cap M|^2 - |M|.$$

Clearly, the last sum takes its minimum, if each $h_0^{-1}(y) \cap M$ contains the same
number of $|M|/(\epsilon r)$ elements. Therefore,

$$\delta_{h_0}(M, M) \geq \epsilon r \left( \frac{m}{\epsilon r} \right)^2 - m = m \left( \frac{m}{\epsilon r} - 1 \right).$$

For $N$, we obtain with similar arguments that $\delta_{h_0}(N, N) \geq m(m/r - 1)$. So we
have the following lower bound on $X_{h_0}$:

$$X_{h_0} \geq m \left( \frac{m}{\epsilon r} + \frac{m}{r} - 2 \right).$$

Together with the upper bound from (1), we obtain

$$\frac{2}{r}\, m(m - 1) \geq m \left( \frac{m}{\epsilon r} + \frac{m}{r} - 2 \right).$$

By the assumption that $m > 2(r - 1)\epsilon/(1 - \epsilon)$, this results in the contradiction

$$2 - \frac{2}{r} \geq m\, \frac{1 - \epsilon}{\epsilon r} > 2\, \frac{r - 1}{r}. \qquad \square$$



We now consider hash functions, which map the $n$-bit universe $U :=
\{0, \dots, 2^n - 1\}$ to the $k$-bit range $R_k := \{0, \dots, 2^k - 1\}$. For $a, b \in U$ let

$$h^k_{a,b} : U \to R_k, \quad x \mapsto ((ax + b) \bmod 2^n) \operatorname{div} 2^{n-k},$$

where "div" is the integer division (i.e. $x \operatorname{div} y = \lfloor x/y \rfloor$). In a bitwise view, the
result of the modulo operation $x \bmod 2^n$ is represented by the $n$ least significant
bits of $x$. On the other hand, the division $x \operatorname{div} 2^{n-k}$ can be seen as shifting $x$ by
$n - k$ digits to the right. In other words, if the value of the linear function $ax + b$
is represented by $(y_{2n-1} \dots y_0)$, then $h^k_{a,b}(x)$ is the integer, which is represented by
the $k$ bits $(y_{n-1} \dots y_{n-k})$. The following result has recently been established by
the author [23,24].
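
The two views of $h^k_{a,b}$ described above (arithmetic with mod and div, and selecting the bits $y_{n-1} \dots y_{n-k}$ of $ax + b$) can be compared directly. A minimal sketch with illustrative function names:

```python
def h(k, n, a, b, x):
    """Arithmetic view: ((a*x + b) mod 2^n) div 2^(n-k)."""
    return ((a * x + b) % 2 ** n) >> (n - k)


def h_bitwise(k, n, a, b, x):
    """Bitwise view: read bits y_{n-1} ... y_{n-k} of a*x + b."""
    # a*x + b fits into 2n bits when a, b and x are n-bit numbers.
    bits = format(a * x + b, "0{}b".format(2 * n))  # bits[0] is y_{2n-1}
    return int(bits[n:n + k], 2)
```

For example, with $n = 4$, $k = 2$, $a = 3$, $b = 2$ and $x = 3$ we get $ax + b = 11 = (00001011)_2$, and both views return $(10)_2 = 2$.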

Theorem 3. Let $1 \leq k \leq n$. Then there exist sets $A \subseteq U$ and $B \subseteq
\{0, \dots, 2^{n-k} - 1\}$ such that the family of hash functions $h^k_{a,b}$ with $a \in A$
and $b \in B$ is universal.

Similar hash classes have been investigated in [10, 11].

3 Lower Bounds

Since the functions $h^k_{a,b}$ are evaluated not only by a multiplication, but also by an
addition, we cannot use Lemma 1 for the lower bound proof of OBDD($\mathrm{MUL}_{k,n}$)
directly. Let $f^k_a := h^k_{a,0}$ be the functions that can be evaluated without addition.
The following lemma gives a result similar to that of Lemma 1. Note that as
stated in [11], the hash functions $f^k_a$ form an "almost" universal hash class (which
means that in Definition 3 $|H|/|R|$ is replaced by $c|H|/|R|$ for some constant $c$).
This property, though, is not sufficient to prove a result as strong as the one
given below.

Lemma 2. Let $M, N \subseteq U$ and $1/2 < \epsilon < 1$. If $|M| = |N| > 2(2^{k+1} - 1)\epsilon/(1 - \epsilon)$,
then there exists an $a \in U$, such that $f^k_a(M)$ and $f^k_a(N)$ contain at least $(2\epsilon - 1)2^k$
elements each.

Proof. By Lemma 1 and Theorem 3, there exist an $a \in U$ and a $b \in
\{0, \dots, 2^{n-k-1} - 1\}$ such that $h^{k+1}_{a,b}(M)$ and $h^{k+1}_{a,b}(N)$ contain more than
$\epsilon|R_{k+1}| = \epsilon 2^{k+1}$ elements each. Let these $a, b$ be fixed and $f := f^k_a$. We show
that $f(M)$ contains at least $(2\epsilon - 1)2^k$ elements; the claim then follows for $N$
with the same argument.

Clearly, there exists a subset $M' \subseteq M$ with $|M'| = \epsilon 2^{k+1}$, such that all
$x \in M'$ have distinct function values under $h^{k+1}_{a,b}$. Since $R_{k+1}$ contains exactly
$2^k$ even elements, there are at least $|M'| - 2^k$ elements in $M'$, which have an
odd function value under $h^{k+1}_{a,b}$. Let $M''$ be a subset of $M'$ containing exactly
$\epsilon 2^{k+1} - 2^k = 2^k(2\epsilon - 1)$ elements with an odd function value. To prove the claim,
it suffices to show that for any two distinct $x, x' \in M''$ we have $f(x) \neq f(x')$.
Let $h^{k+1}_{a,b}(x) = z$ and $h^{k+1}_{a,b}(x') = z'$, thus

$$z\,2^{n-k-1} \leq (ax + b) \bmod 2^n < (z + 1)\,2^{n-k-1}.$$

Since by definition $0 \leq b < 2^{n-k-1}$, it follows that

$$(z - 1)\,2^{n-k-1} < (ax) \bmod 2^n < (z + 1)\,2^{n-k-1}.$$

Further, by $z$ being odd, $(z - 1)/2$ equals $\lfloor z/2 \rfloor$ and $(z + 1)/2$ equals $\lfloor z/2 \rfloor + 1$.
Therefore, the above inequalities imply

$$\lfloor z/2 \rfloor\, 2^{n-k} \leq (ax) \bmod 2^n < (\lfloor z/2 \rfloor + 1)\, 2^{n-k}.$$

This means that $f(x) = \lfloor z/2 \rfloor$, and with the same argument also $f(x') = \lfloor z'/2 \rfloor$.
But because $z$ and $z'$ are both odd and different, clearly $\lfloor z/2 \rfloor$ and $\lfloor z'/2 \rfloor$ are
different, too. So, we obtain the desired result $f(x) \neq f(x')$. □
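
The key step of this proof, that an odd value $z = h^{k+1}_{a,b}(x)$ with $0 \leq b < 2^{n-k-1}$ forces $f^k_a(x) = \lfloor z/2 \rfloor$, can be confirmed exhaustively for small parameters. A sketch with illustrative function names:

```python
def h(k, n, a, b, x):
    """h^k_{a,b}(x) = ((a*x + b) mod 2^n) div 2^(n-k)."""
    return ((a * x + b) % 2 ** n) >> (n - k)


def f(k, n, a, x):
    """f^k_a(x) = h^k_{a,0}(x)."""
    return h(k, n, a, 0, x)


def odd_values_halve(n, k):
    """Check f^k_a(x) == z // 2 whenever z = h^{k+1}_{a,b}(x) is odd."""
    for a in range(2 ** n):
        for x in range(2 ** n):
            for b in range(2 ** (n - k - 1)):
                z = h(k + 1, n, a, b, x)
                if z % 2 == 1 and f(k, n, a, x) != z // 2:
                    return False
    return True
```

The check covers all $a$ and $x$ in the universe and all admissible $b$, so it is only feasible for small $n$ such as $n = 6$.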

We are now ready to prove an intermediate result, from which the lower
bound for the OBDD size of $\mathrm{MUL}_{n-1,n}$ follows easily. In order to do so, we
have to introduce some more notation. Let $x$ be an integer represented in a
bitwise notation as $(x_{n-1} \dots x_0)$. Then we write $[x]_k$ for the $(k+1)$-th bit $x_k$.
Further, let $\mathrm{MUL}^a_{k,n} : \{0,1\}^n \to \{0,1\}$ for $a \in \{0,1\}^n$ be the Boolean function
that computes the $(k+1)$-th bit of the product of $a$ with an $n$-bit number, i.e.
$\mathrm{MUL}^a_{k,n}(x) = \mathrm{MUL}_{k,n}(a, x)$.

Theorem 4. Let $\pi$ be an arbitrary variable ordering on $X_n$. Then there exists
an $a \in \{0, \dots, 2^n - 1\}$ for which any $\pi$-OBDD for $\mathrm{MUL}^a_{n-1,n}$ consists of at least
$2^{\lfloor n/2 \rfloor}/121 + 1$ nodes.

Proof. Let the input variables for the $\pi$-OBDD be $x_{n-1}, \dots, x_0$ for an $n$, which is
w.l.o.g. even. Consider the top part $T$ of $\pi$, which contains the first $n/2$ variables
with respect to $\pi$ and the bottom part $B$ containing the other $n/2$ variables. We
construct now two sets $M$ and $N$ of numbers in $\{0, \dots, 2^n - 1\}$ as follows: $M$
contains all numbers which can be represented by $(x_{n-1} \dots x_0)$ if the variables
from $T$ are set to 0, and $N$ contains all numbers which can be represented by
$(x_{n-1} \dots x_0)$ if the variables from $B$ are set to 0. Note that any number in
$\{0, \dots, 2^n - 1\}$ can be uniquely expressed as $p + q$ for $p \in M$ and $q \in N$.

Our goal is to find an appropriate constant $a$ and two subsets $M' \subseteq M$,
$N' \subseteq N$ with the following property: For any distinct $q, q'$ in $N'$, there exists
a $p \in M'$ such that $a(p + q)$ and $a(p + q')$ differ in the $n$-th bit. More formally,

$$\forall q, q' \in N',\ q \neq q'\ \ \exists p \in M' : [a(p + q)]_{n-1} \neq [a(p + q')]_{n-1}. \qquad (2)$$

Since $q$ and $q'$ are determined only by the top variables and $p$ is determined
by the bottom variables, it follows that among the $2^{n/2}$ subfunctions obtained
by replacing the top variables with constants, there are at least $|N'|$ pairwise
different ones. So, at level $n/2$, the $\pi$-OBDD consists of at least $|N'|$ nodes.
Further, a simple inductive argument shows that any OBDD contains in a level
$i$ at most one more node than there are nodes in all preceding levels $1, \dots, i - 1$
together. Therefore, the total number of nodes in the OBDD for $\mathrm{MUL}^a_{n-1,n}$ is
at least $2|N'| + 1$ (including the two sinks).

Let $\epsilon = 16/17$ and $k = n/2 - 6$. Then by an easy calculation one obtains that

$$|M| = |N| = 2^{n/2} > 2\,(2^{k+1} - 1)\,\frac{\epsilon}{1 - \epsilon}.$$
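
The easy calculation can be spelled out as follows, using only $\epsilon/(1 - \epsilon) = (16/17)/(1/17) = 16$ and $k + 6 = n/2$:

```latex
\[
2\,(2^{k+1}-1)\,\frac{\epsilon}{1-\epsilon}
  \;=\; 2\,(2^{k+1}-1)\cdot 16
  \;=\; 2^{k+6} - 32
  \;=\; 2^{n/2} - 32
  \;<\; 2^{n/2} \;=\; |M| \;=\; |N| .
\]
```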

By Lemma 2, there exists an $a$ for which $f^k_a(M)$ and $f^k_a(N)$ contain at least
$(2\epsilon - 1)2^k = 15/17 \cdot 2^k$ elements each. We fix this $a$, define $f = f^k_a$ and continue
to determine appropriate $M'$ and $N'$.

As an intermediate step, we choose $M^*$ and $N^*$ to be minimal subsets of $M$
respectively $N$, such that $f(M^*)$ and $f(N^*)$ contain exactly $13/17 \cdot 2^{k-1}$ even
elements. Such sets exist, since at most $2^{k-1}$ of the $2^k$ possible function values
are odd, and thus at least $15/17 \cdot 2^k - 2^{k-1} = 13/17 \cdot 2^{k-1}$ of the elements in $M$
respectively $N$ have distinct and even function values under $f$. Note that because
we required $M^*$ and $N^*$ to be minimal, no two elements from $M^*$ respectively
$N^*$ have the same function value under $f$.

The following observation is crucial for the rest of the proof: For any $p \in M^*$
and any $q \in N^*$, the $k$-th bit of $f(p) + f(q)$ has the same value as the $n$-th bit
of $a(p + q)$. Or formally

$$[f(p) + f(q)]_{k-1} = [a(p + q)]_{n-1}. \qquad (3)$$

The reason for this is that the rightmost bits of $f(p)$ and $f(q)$ are both zero (since
these values are even). Recalling that the division executed by $f$ is in fact a
right-shift by $n - k$ bits, we obtain $[ap]_{n-k} = [aq]_{n-k} = 0$. Therefore, the bits of $ap + aq$
with higher index than $n - k$ are not influenced by a carry bit resulting from
the addition of the less significant bits $([ap]_{n-k} \dots [ap]_0) + ([aq]_{n-k} \dots [aq]_0)$.
This means that $f(p) + f(q)$ has in all bits (except possibly the least significant
one) the same value as $a(p + q)$ in the bits with indices $n - 1, \dots, n - k$, and
equation (3) is true.
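
The carry argument behind equation (3) uses only the fact that $f(p)$ and $f(q)$ are even, so it can be tested over all $p$ and $q$ in the universe rather than just $M^*$ and $N^*$. A brute-force sketch for small parameters (function names are illustrative, and $k \geq 2$ is assumed so that bit $k - 1$ is not the least significant one):

```python
def f(k, n, a, x):
    """f^k_a(x) = ((a*x) mod 2^n) div 2^(n-k)."""
    return ((a * x) % 2 ** n) >> (n - k)


def bit(value, index):
    """The bit of `value` with the given index (index 0 is least significant)."""
    return (value >> index) & 1


def equation_3_holds(n, k):
    """Check [f(p)+f(q)]_{k-1} == [a(p+q)]_{n-1} whenever f(p), f(q) are even."""
    for a in range(2 ** n):
        for p in range(2 ** n):
            fp = f(k, n, a, p)
            if fp % 2:
                continue
            for q in range(2 ** n):
                fq = f(k, n, a, q)
                if fq % 2:
                    continue
                if bit(fp + fq, k - 1) != bit(a * (p + q), n - 1):
                    return False
    return True
```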

In order to satisfy property (2) it is sufficient by the above arguments that 
the sets M' and N' are subsets of M* and N* and that the following holds: 

V(j, q' eN',q^ q' 3p e M' : [f{p) + f{q)]k-i ^ [f{p) + f{q')]k-i- (4) 

We set M' = M* and 

N' = {qeN* \3peM' ■. f{q) = 2^= - f{p)}. (5) 

In order to prove claim (4), let q and q' be arbitrary distinct elements from N' . 
Since q and q' are in N* and therefore have distinct function values under /, we 
may assume w.l.o.g. that 

0 < (/(g) - /(gO) mod 2'= < 2'=-^ (6) 

(otherwise we achieve this by exchanging q and q'). By construction, there exists 
ape M' with f{p) + J{q) = 2*. For this p, obviously the fc-th bit of J{p) + f{q), 
that is [f{p) + f{q)]k-i, equals 0. But on the other hand, by inequation (6), the 
value of (/(p) -I- f{q')) mod 2* is in {2^“^, ... ,2^ — l}. This means that the fc-th 
bit of /(p) -I- f{q') equals 1, and thus claim (4) is proven. 

So far, we have constructed subsets $M' \subseteq M$ and $N' \subseteq N$ which satisfy
claim (2), implying by our arguments a lower bound on the $\pi$-OBDD size of
$2|N'| + 1$. All that is left to do is to give an appropriate lower bound on $|N'|$.
Recall the definition of $N'$ in (5), and that $f(M')$ and $f(N^*)$ contain
$13/17 \cdot 2^{k-1}$ even elements each. Because for any even $f(p)$ also $2^k - f(p)$ is even,
the set $\{2^k - f(p) \mid p \in M'\}$ contains $13/17 \cdot 2^{k-1}$ even elements, too. But since
there exist only $2^{k-1}$ even elements in $\{0, \dots, 2^k\}$, the intersection of $f(N^*)$ and
$\{2^k - f(p) \mid p \in M'\}$, which is $f(N')$, has a cardinality of at least $d \cdot 2^{k-1}$,
where $d = 1 - 2(1 - 13/17) = 9/17$. By the choice of $k$, $2|N'| + 1$ (and thus also
the size of the $\pi$-OBDD) is bounded below by

$$2|f(N')| + 1 \ge 2 \cdot \frac{9}{17} \cdot 2^{k-1} + 1 = \frac{9}{17} \cdot 2^{n/2-6} + 1 > \frac{2^{n/2}}{121} + 1. \qquad \Box$$
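The two arithmetic steps behind this bound can be spelled out as follows. This is a sketch under the assumption that $k = n/2 - 6$, which is inferred from the exponent $n/2 - 6$ in the chain above; the choice of $k$ itself is made earlier in the proof.

```latex
% Counting step: both sets lie among at most 2^{k-1} even values,
% so their intersection has at least |A| + |B| - 2^{k-1} elements.
\[
  \bigl| f(N^*) \cap \{\, 2^k - f(p) \mid p \in M' \,\} \bigr|
  \;\ge\; 2 \cdot \tfrac{13}{17} \cdot 2^{k-1} - 2^{k-1}
  \;=\; \tfrac{9}{17} \cdot 2^{k-1}.
\]
% Constant: 17 * 64 = 1088 and 9 * 121 = 1089.
\[
  \tfrac{9}{17} \cdot 2^{n/2-6}
  \;=\; \tfrac{9}{1088} \cdot 2^{n/2}
  \;>\; \tfrac{9}{1089} \cdot 2^{n/2}
  \;=\; \tfrac{2^{n/2}}{121}.
\]
```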

This theorem shows the general result for $\mathrm{MUL}_{n-1,n}$ by the following
straightforward observation: If for some constant $B$ and some variable ordering $\pi$ there
exists an $a$ for which the $\pi$-OBDD size of $\mathrm{MUL}^a_{n-1,n}$ is at least $B + 1$, then
the $\pi$-OBDD size of $\mathrm{MUL}_{n-1,n}$ is at least $2B$. This is because in any OBDD
computing $\mathrm{MUL}_{n-1,n}(x, y)$ either the input $x$ or the input $y$ may be set to the
constant $a$. In both cases the resulting OBDD contains at least $B - 1$ inner
nodes, not counting those for variables fixed to constants (since they may be
deleted without changing the function). So, by the last theorem the OBDD for
$\mathrm{MUL}_{n-1,n}$ has a size of at least $2 \cdot 2^{n/2}/121$, which proves the main result
(Theorem 1).

Furthermore, by a straightforward reduction, one can easily obtain a lower
bound on computing the other output bits of the multiplication. A simple proof
(see [8], Corollary 1) shows that any representation computing $\mathrm{MUL}_{k-1,n}$ or
$\mathrm{MUL}_{2n-k-1,n}$ may also compute $\mathrm{MUL}_{k-1,k}$.

Corollary 1. The size of an OBDD computing $\mathrm{MUL}_{k-1,n}$ or $\mathrm{MUL}_{2n-k-1,n}$ is
at least $2^{\lfloor k/2 \rfloor}/61$.

Note that our lower bound on $\mathrm{MUL}_{n-1,n}$ relies only on the existence of a
constant $a$ for each variable ordering $\pi$ for which $\mathrm{MUL}^a_{n-1,n}$ leads to a large
$\pi$-OBDD representation. A significant improvement of this bound
would require a different technique, taking more values for $a$
into consideration. In other words, the result of Theorem 4 is optimal up to a
small constant factor:

Theorem 5. There exists a variable ordering $\pi$ which allows for any $a \in
\{0, \dots, 2^n - 1\}$ the construction of a $\pi$-OBDD for $\mathrm{MUL}^a_{n-1,n}$ having a size of
at most $3 \cdot 2^{\lceil n/2 \rceil}$.

The proof will be sketched at the end of the next section. 



4 Upper Bounds 

In this section, we derive the upper bounds stated in Theorems 2 and 5. Both
bounds can be proven by the same technique, which makes use of the fact that
the minimal-size $\pi$-OBDD for a Boolean function $f$ is unique up to isomorphism
[7], and of a theorem by Sieling and Wegener [19], describing the structure of
the minimal-size $\pi$-OBDD.

Let $f$ be a Boolean function and $\pi$ be an arbitrary variable ordering on
its input variables $x_1, \dots, x_n$.
For $a_1, \dots, a_i \in \{0, 1\}$ ($1 \le i \le n$), denote by $f_{a_1,\dots,a_i}$ the subfunction of $f$ that
computes $f(x_1, \dots, x_n)$, where for $1 \le j \le i$ the $j$-th input variable according
to $\pi$ (that is, $x_{\pi^{-1}(j)}$) is fixed to the constant $a_j$. More formally,

$$f_{a_1,\dots,a_i} := f|_{x_{\pi^{-1}(1)} = a_1, \dots, x_{\pi^{-1}(i)} = a_i}.$$

Further, we say that a function $g$ essentially depends on an input variable $x_i$ if
$g|_{x_i=0} \neq g|_{x_i=1}$.

Theorem 6 ([19]). The number of $x_i$-nodes of the minimal-size $\pi$-OBDD for
$f$ is the number of different subfunctions $f_{a_1,\dots,a_{i-1}}$ for $a_1, \dots, a_{i-1} \in \{0,1\}$,
essentially depending on $x_i$.

In order to show the stated upper bounds, let $x = (x_{n-1} \dots x_0)$ and $y =
(y_{n-1} \dots y_0)$ be the input variables for $\mathrm{MUL}_{n-1,n}$. Further, let $\mathcal{F}_i$ denote the
family of subfunctions $f_{x,y^*}$ of $\mathrm{MUL}_{n-1,n}$ that result from replacing the variables
$x_0, \dots, x_{n-1}$ and $y_0, \dots, y_{i-1}$ with constants. I.e., for $y^* := (y_{i-1} \dots y_0)$,

$$f_{x,y^*}(y_{n-1} \dots y_i) = \mathrm{MUL}_{n-1,n}(x, y_{n-1} \dots y_i\, y^*).$$

Our goal is to bound the number of different subfunctions in $\mathcal{F}_i$. We define
for any subfunction $f_{x,y^*} \in \mathcal{F}_i$ its index $\mathrm{ind}(f_{x,y^*})$ to be the number represented
by $(z_{n-1} \dots z_i)$, where $z = x \cdot y^*$. Consider arbitrary $x$ and $y = (y_{n-1} \dots y_i\, y^*)$.
By the school method of multiplication we have

$$x \cdot y = x \cdot y^* + 2^i x \cdot (y_{n-1} \dots y_i).$$

Since the second term of the sum is a value shifted by $i$ bits to the left (and thus
has its $i$ least significant bits set to 0), the addition of $x \cdot y^*$ and $2^i x \cdot (y_{n-1} \dots y_i)$
has no carry at position $i$. Hence, replacing $x \cdot y^*$ by $2^i \cdot \mathrm{ind}(f_{x,y^*})$ in the above
sum does not change the result for the output bits with indices $i, \dots, n-1$.
Furthermore, writing $2^i x \cdot (y_{n-1} \dots y_i)$ as

$$\sum_{j=0}^{n-1} 2^{i+j} x_j \cdot (y_{n-1} \dots y_i)$$

implies that the bits $x_j$ with $j \ge n - i$ have no influence on the output bit with
index $n - 1$. Thus, $\mathrm{MUL}_{n-1,n}(x, y)$ is uniquely determined by $\mathrm{ind}(f_{x,y^*})$ and
$2^i (x_{n-i-1} \dots x_0) \cdot (y_{n-1} \dots y_i)$. We summarize this result in the following claim:

Claim 1. Each subfunction $f_{x,y^*} \in \mathcal{F}_i$ is uniquely determined by $(x_{n-i-1} \dots x_0)$
and its index $\mathrm{ind}(f_{x,y^*})$. □
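Claim 1 can be illustrated by brute force for small $n$. The sketch below (all names hypothetical) groups the truth tables of the subfunctions $f_{x,y^*}$ by the pair consisting of the low $n-i$ bits of $x$ and the index, and checks that each group contains exactly one truth table. It assumes $0 \le i \le n-1$, so that the index includes bit $n-1$ of $x \cdot y^*$.

```python
def mul_bit(x, y, n):
    """MUL_{n-1,n}(x, y): the bit with index n-1 of the product x*y."""
    return ((x * y) >> (n - 1)) & 1


def subfunctions_by_key(n, i):
    """Map (x_{n-i-1}..x_0, ind) to the set of truth tables of f_{x,y*}."""
    groups = {}
    mask = (1 << (n - i)) - 1
    for x in range(1 << n):
        for ystar in range(1 << i):
            ind = ((x * ystar) >> i) & mask   # bits i..n-1 of z = x*y*
            key = (x & mask, ind)
            # truth table over the free variables y_i, ..., y_{n-1}
            table = tuple(mul_bit(x, (hi << i) | ystar, n)
                          for hi in range(1 << (n - i)))
            groups.setdefault(key, set()).add(table)
    return groups


def claim1_holds(n, i):
    """True if every key determines a single subfunction."""
    groups = subfunctions_by_key(n, i)
    return all(len(tables) == 1 for tables in groups.values())


def count_subfunctions(n, i):
    """Number of distinct subfunctions in the family F_i."""
    groups = subfunctions_by_key(n, i)
    return len({t for tables in groups.values() for t in tables})
```

Since there are at most $2^{n-i}$ values for each component of the key, `count_subfunctions(n, i)` is at most $2^{2n-2i}$ whenever `claim1_holds(n, i)` is true.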

We are now ready to prove the upper bounds. 

Proof (of Theorem 2). Let $G$ be the minimal-size $\pi$-OBDD which reads first
all $x$-bits and then the bits $y_0, \dots, y_{n-1}$ in this order. Further, let $k = \lceil n/3 \rceil$.
Call the upper part of $G$ the subgraph in which the $x$-variables and the
variables $y_0, \dots, y_{k-1}$ are read. Obviously, this part contains at most as many
nodes as a balanced binary tree with $n + k$ levels, and thus has a size of at most

$$2^{n+k} - 1.$$

We now bound the number of $y_i$-nodes for $i \ge k$. By Theorem 6, this is
at most the number of different subfunctions $f_{x,y^*}$ in $\mathcal{F}_i$. But since there are
only $2^{n-i}$ different values for $\mathrm{ind}(f_{x,y^*})$ and as many values for $(x_{n-i-1} \dots x_0)$,
it follows from Claim 1 that there are at most $2^{2n-2i}$ different subfunctions in
$\mathcal{F}_i$. So, the bottom part of $G$ consists of at most $2^{2n-2i}$ inner nodes for each
$i \in \{k, \dots, n-1\}$. An easy calculation shows that both parts contain together
at most

$$2^{n+k} - 1 + \sum_{i=k}^{n-1} 2^{2n-2i} = 2^{n+k} + \frac{4}{3} \cdot 2^{2n-2k} - \frac{7}{3}$$

inner nodes. Since $k = \lceil n/3 \rceil$, we may write $n = 3k - r$ for some $0 \le r < 3$.
Thus, also counting the two sinks, the $\pi$-OBDD size is bounded above by

$$2^{4k-r} + \frac{4}{3} \cdot 2^{4k-2r} - \frac{7}{3} + 2 < 2^{4k-4r/3} \left( 2^{r/3} + \frac{4}{3} \cdot 2^{-2r/3} \right).$$

By case distinction ($r = 0, 1, 2$) it can easily be verified that the factor in
parentheses has a value of at most $7/3$. Since further the exponent of the first factor
($4k - 4r/3$) equals $4n/3$, the proof is complete. □
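The "easy calculation" and the final bound can be checked with exact integer arithmetic. This is a minimal sketch with hypothetical function names; it compares the node count from the proof against the closed form of the geometric sum and against $\frac{7}{3} \cdot 2^{4n/3}$, where the last comparison is done after cubing both sides to avoid floating point.

```python
def theorem2_node_count(n):
    """Upper part + lower part + two sinks, as counted in the proof."""
    k = -(-n // 3)                                   # k = ceil(n/3)
    upper = 2 ** (n + k) - 1                         # balanced tree, n+k levels
    lower = sum(2 ** (2 * n - 2 * i) for i in range(k, n))
    return upper + lower + 2


def check_theorem2_bound(n):
    """Check the closed form of the sum and the bound (7/3) * 2^(4n/3)."""
    k = -(-n // 3)
    total = theorem2_node_count(n)
    # 3 * (2^{n+k} + (4/3) 2^{2n-2k} - 7/3 + 2), kept integral
    closed_times_3 = 3 * 2 ** (n + k) + 4 * 2 ** (2 * n - 2 * k) - 1
    # total <= (7/3) 2^{4n/3}  <=>  (3 total)^3 <= 7^3 * 2^{4n}
    within_bound = (3 * total) ** 3 <= 7 ** 3 * 2 ** (4 * n)
    return 3 * total == closed_times_3 and within_bound
```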

The proof of Theorem 5 for $\mathrm{MUL}^a_{n-1,n}$ uses almost the same line of argument,
so we only sketch the differences. The vector $x$ of variables is replaced with
the constant $a$, and the variables $y_0, \dots, y_{n-1}$ are again read in this order. But
now, the upper part of the OBDD consists of the first $n/2$ variables of $y$, that is
$y_0, \dots, y_{n/2-1}$, and its size is again bounded by that of a binary tree ($2^{n/2} - 1$).
Using the index of the functions $f_{a,y^*}$, the number of different subfunctions
in $\mathcal{F}_i$ is then bounded for $n/2 \le i \le n - 1$ similarly to the above proof. In
this way, we conclude that the lower part of the OBDD consists of at most
$\sum_{i=n/2}^{n-1} 2^{n-i} = 2(2^{n/2} - 1)$ inner nodes, which shows the claim.
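For small $n$ the bound of Theorem 5 can be observed directly. The following sketch (hypothetical names) uses the characterization of Theorem 6 to count the nodes of the minimal OBDD for $\mathrm{MUL}^a_{n-1,n}$ under the ordering $y_0, \dots, y_{n-1}$: at level $i$ it counts the distinct subfunctions, over all assignments to $y_0, \dots, y_{i-1}$, that essentially depend on $y_i$.

```python
def mul_bit(a, y, n):
    """MUL^a_{n-1,n}(y): the bit with index n-1 of the product a*y."""
    return ((a * y) >> (n - 1)) & 1


def obdd_size_natural_order(a, n):
    """Size (inner nodes plus sinks) of the minimal OBDD reading y_0..y_{n-1}."""
    inner = 0
    for i in range(n):
        seen = set()
        for ystar in range(1 << i):
            # truth table over y_i..y_{n-1}; y_i is the lowest bit of hi
            table = tuple(mul_bit(a, (hi << i) | ystar, n)
                          for hi in range(1 << (n - i)))
            # essential dependence on y_i: cofactors y_i=0 and y_i=1 differ
            if table[0::2] != table[1::2]:
                seen.add(table)
        inner += len(seen)
    sinks = len({mul_bit(a, y, n) for y in range(1 << n)})
    return inner + sinks


def theorem5_bound_holds(n):
    """Check size <= 3 * 2^ceil(n/2) for every constant a in {0..2^n - 1}."""
    bound = 3 * 2 ** ((n + 1) // 2)
    return all(obdd_size_natural_order(a, n) <= bound for a in range(1 << n))
```

The enumeration is exponential in $n$ and is meant only as an illustration of the counting argument, not as a construction method.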

Note that it is possible to specify the subfunctions $f_{x,y^*}$ explicitly. This means
that the above proofs do not only show the existence of OBDDs with the
properties stated in Theorems 2 and 5, but can in fact be used to construct them.